Open Rcheng0731 opened 1 year ago
I need more info. It looks as if all cluster centers were initialized to the same value, but you couldn't get the code to do that without editing it, so I'm not sure how this could happen. Can you tell me more about the data, the preprocessing, and the souporcell command line?
This is my original data format. I am trying to convert it to the souporcell input format.
"AAAAAAAGAACG" "AAAAAAAGTCGT" "AAAAAAATGAAT" "AAAAAAATTATA" "AAAAAACCCCCG" "AAAAAACTGTTA" "AAAAAATCCAGA" "AAAAACACACAG" "AAAAACATCCTA" "AAAAACATTCAG" "AAAAACCCAACT"
"1_rG_A" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"3_rT_A" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"4_rC_A" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"6_rC_A" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
$ head alt.mtx
%%MatrixMarket matrix coordinate real general
% written by sprs
16569 42706 164569725
1 145 0
1 156 0
1 390 0
1 402 0
1 419 0
1 484 0
1 510 0

$ head ref.mtx
%%MatrixMarket matrix coordinate real general
% written by sprs
16569 42706 164569725
1 145 1
1 156 1
1 390 1
1 402 1
1 419 2
1 484 1
1 510 1
I tried to run just this step:
souporcell -a alt.mtx -r ref.mtx -b barcodes.tsv -k
I don't know whether the problem is an error in my data format conversion or whether this approach is simply not feasible.
I still don't understand. So this is 16k cells and 42k variants? Also, why does ref.mtx have counts while alt.mtx does not? Do later alt entries have non-zero values?
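One way to answer the non-zero question without scrolling through the file is to count the stored values directly. The following is a minimal sketch, assuming the files are plain-text MatrixMarket coordinate files like the `head` output above; the function name `summarize_mtx` is hypothetical and not part of souporcell.

```python
import io


def summarize_mtx(handle):
    """Report dimensions and how many stored values are zero vs non-zero."""
    # Skip the banner and comment lines, which start with '%'.
    line = handle.readline()
    while line.startswith("%"):
        line = handle.readline()
    # The first non-comment line is "rows cols declared_entries".
    n_rows, n_cols, n_declared = (int(x) for x in line.split())
    zeros = 0
    nonzeros = 0
    for line in handle:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        if float(parts[2]) == 0:
            zeros += 1
        else:
            nonzeros += 1
    return {
        "rows": n_rows,
        "cols": n_cols,
        "declared": n_declared,
        "zeros": zeros,
        "nonzeros": nonzeros,
    }


# Tiny made-up sample in the same layout as alt.mtx (values are illustrative only).
sample = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n"
    "% written by sprs\n"
    "4 600 3\n"
    "1 145 0\n"
    "1 156 0\n"
    "2 390 3\n"
)
summary = summarize_mtx(sample)
print(summary)
```

For the real files, something like `with open("alt.mtx") as f: print(summarize_mtx(f))` would give the same summary; a very small `nonzeros` count in alt.mtx would confirm that almost no alternate alleles were recorded.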
The first column is the locus index, i.e. positions in the human mitochondrial genome, 16k in total. The second column is the cell index, 46k in total. The later alt entries do contain non-zero values, but very few, probably because there are few mutations. I'm not sure whether my understanding of alt.mtx and ref.mtx is correct.
46k cells is already a bit suspicious. This is going to be very sparse data. What is the UMI count per cell?
Still, there must be something fundamentally wrong to produce those results. What do you mean when you say there are few mutations? Souporcell is for differentiating cells from different individuals. Even related individuals (other than identical twins) have enough differences to demultiplex them, at least when only a few are mixed.
I need to know what the experiment is, what the samples are, and how you are transforming the data into souporcell input. Be detailed and maybe I can help.
I want to use my own variant data to cluster cells with souporcell, but I am running into some problems. Can you help me?
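If UMI counts are not at hand, a rough proxy for per-cell coverage can be read off the two matrices. This is only a sketch under two assumptions: that columns are cells, and that ref plus alt counts at the listed loci stand in for coverage (they are allele counts, not true UMI totals). The helper name `counts_per_column` is hypothetical.

```python
import io
from collections import Counter


def counts_per_column(*handles):
    """Sum the stored values per column index across one or more .mtx files."""
    totals = Counter()
    for handle in handles:
        size_line_seen = False
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # banner, comment, or blank line
            if not size_line_seen:
                size_line_seen = True  # skip the "rows cols entries" line
                continue
            _row, col, value = line.split()
            totals[int(col)] += float(value)
    return totals


# Made-up miniature ref/alt pair, for illustration only.
ref = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n"
    "3 200 3\n"
    "1 145 1\n"
    "1 156 1\n"
    "2 145 2\n"
)
alt = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n"
    "3 200 2\n"
    "1 145 0\n"
    "2 156 3\n"
)
totals = counts_per_column(ref, alt)
print(sorted(totals.items()))
```

Looking at the distribution of these totals (for example the median) would show how many cells have almost no coverage at all.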
Here's my clusters_tmp.tsv
thread 3 iteration 12 done with -11048627, best so far -11048627
thread 1 iteration 12 done with -11048627, best so far -11048627
thread 7 iteration 12 done with -11048627, best so far -11048627
binomial 0 12 11 8 -11048627 0
thread 0 iteration 12 done with -11048627, best so far -11048627
binomial 6 12 11 8 -11048627 0
binomial 2 12 11 8 -11048627 0
thread 6 iteration 12 done with -11048627, best so far -11048627
thread 2 iteration 12 done with -11048627, best so far -11048627
..............
AAAAACCGTGGT 0 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059 -234.45059
AAAAACCTAACA 0 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901 -196.18901
AAAAACGCAACG 0 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284 -245.19284
AAAAACGCCAGA 0 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036 -185.21036
AAAAACTAGTAC 0 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436 -229.84436
AAAAACTCCGAC 0 -192.864 -192.864 -192.864 -192.864 -192.864 -192.864 -192.864 -192.864 -192.864 -192.864
AAAAACTGAATG 0 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638 -475.0638
AAAAACTGTGGG 0 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725 -330.14725
AAAAAGATAACT 0 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399 -180.41399
AAAAAGATGCGA 0 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815 -216.50815
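In the rows above, every cluster gives each cell exactly the same log-likelihood, which is the "all cluster centers identical" symptom. To see how widespread this is across the whole file, a small check could look like the sketch below. It assumes the column layout seen in the excerpt (barcode, assigned cluster, then one log-likelihood per cluster) and skips the interleaved `thread` and `binomial` log lines; the function name and the `HYPOTHETICAL1` row are made up for illustration.

```python
def cells_with_identical_loglikes(lines, tol=1e-9):
    """Return (flagged_barcodes, total_cells) for rows whose log-likelihoods all match."""
    flagged = []
    total = 0
    for line in lines:
        fields = line.split()
        # Skip progress/log lines and anything too short to be a cell row.
        if len(fields) < 4 or fields[0] in ("thread", "binomial"):
            continue
        try:
            loglikes = [float(x) for x in fields[2:]]
        except ValueError:
            continue  # not a numeric cell row
        total += 1
        if max(loglikes) - min(loglikes) <= tol:
            flagged.append(fields[0])
    return flagged, total


# Two rows shortened from the excerpt, plus one made-up row with distinct values.
sample_lines = [
    "thread 3 iteration 12 done with -11048627, best so far -11048627",
    "binomial 0 12 11 8 -11048627 0",
    "AAAAACCGTGGT 0 -234.45059 -234.45059 -234.45059",
    "AAAAACCTAACA 0 -196.18901 -196.18901 -196.18901",
    "HYPOTHETICAL1 1 -250.0 -120.5 -260.0",
]
flagged, total = cells_with_identical_loglikes(sample_lines)
print(f"{len(flagged)} of {total} cells have identical log-likelihoods")
```

If nearly every cell is flagged, the clusters never separated, which points at the input matrices rather than at any individual cell.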