wheaton5 / souporcell

Clustering scRNAseq by genotypes
MIT License

doublet detection errors with troublet: non-zero exit status and index out of bounds #231

Open shirshuj opened 4 months ago

shirshuj commented 4 months ago

I completed the tutorial and two samples of my own without error but now I'm consistently getting this problem with subsequent samples:

running souporcell doublet detection
Traceback (most recent call last):
  File "/opt/souporcell/souporcell_pipeline.py", line 596, in <module>
    doublets(args, ref_mtx, alt_mtx, cluster_file)
  File "/opt/souporcell/souporcell_pipeline.py", line 541, in doublets
    subprocess.check_call([directory+"/troublet/target/release/troublet", "--alts", alt_mtx, "--refs", ref_mtx, "--clusters", cluster_file], stdout = dub, stderr = err)
  File "/usr/local/envs/py36/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/opt/souporcell/troublet/target/release/troublet', '--alts', 'S8C_souporcell_2/alt.mtx', '--refs', 'S8C_souporcell_2/ref.mtx', '--clusters', 'S8C_souporcell_2/clusters_tmp.tsv']' returned non-zero exit status 101.

I checked the doublets.err file which says:

thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 95256', src/main.rs:328:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But I'm not sure how to correct this issue. Any advice or things to try are appreciated. Thank you!

wheaton5 commented 4 months ago

The problem is probably upstream of doublet detection if the len is 0. Can you give me the first few lines of alt.mtx and clusters_tmp.tsv?

shirshuj commented 4 months ago

First lines of alt.mtx

%%MatrixMarket matrix coordinate real general
% written by sprs
14492 6794880 457068
1 95257 1
1 337008 1
1 1023389 1
1 1158281 1
1 2192702 1
1 3038255 1
1 3261750 1
1 3548729 1
1 3729811 1
1 3736516 1
1 3893785 1
1 4267166 1
1 4336965 1

But clusters_tmp.tsv is empty.

wheaton5 commented 4 months ago

Right, so you are using the unfiltered barcodes file. Only use barcodes that are cells according to cellranger. This file should be in outs/filtered/ or something like that.

shirshuj commented 4 months ago

The filepath is going to the filtered ones for each sample but maybe something got corrupted in them. I'll try some different samples to see if they will run and I will try taking a closer look at the filtered barcodes files that aren't working.

wheaton5 commented 4 months ago

Well that alt.mtx says you have 6.79 million cells
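
For anyone who wants to check this on their own run, here is a minimal sketch of reading the size line of a Matrix Market file such as alt.mtx. The function name `read_mtx_shape` is hypothetical and not part of souporcell. It assumes, following the comment above, that the second number on the size line is the number of cell barcodes.

```python
def read_mtx_shape(path):
    """Return (rows, cols, nonzeros) from a Matrix Market coordinate file."""
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and the '%' banner/comment lines.
            if not line.strip() or line.startswith("%"):
                continue
            # The first remaining line is the size line: rows cols nonzeros.
            rows, cols, nonzeros = (int(tok) for tok in line.split()[:3])
            return rows, cols, nonzeros
    raise ValueError("no size line found in " + path)
```

For the alt.mtx quoted earlier in this thread, the size line `14492 6794880 457068` would come back as `(14492, 6794880, 457068)`, and the middle value is the 6.79 million figure.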

shirshuj commented 4 months ago

That doesn't sound right at all but I'm looking into it. Thanks for your help in figuring out where I need to look for troubleshooting these runs!

megannguyen1009 commented 1 week ago

I'm having this same exact error. Did you ever figure out how to fix it?

wheaton5 commented 1 week ago

Use the filtered barcodes not the raw barcodes

wheaton5 commented 1 week ago

In cellranger output there is a filtered folder and raw folder. Get it from the filtered folder

megannguyen1009 commented 1 week ago

Hello thank you for your quick reply. My barcodes.tsv file is from the filtered folder, and I still got those errors. Do you know any other possibilities that could be causing this error?

wheaton5 commented 1 week ago

Give me the first 3 lines of alt.mtx

megannguyen1009 commented 1 week ago

My first few lines of alt.mtx

%%MatrixMarket matrix coordinate real general
% written by sprs
2112270 2035673 50014710
1 64536 1
1 140738 0
1 239400 0
1 251279 1
1 254260 0
1 279530 0
1 518165 0

wheaton5 commented 1 week ago

Right. This is due to an unfiltered barcodes file. This says you have 2 million cells.
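
A quick way to sanity-check the barcodes file before running the pipeline is to count its entries. This is only a sketch: `count_barcodes` is a hypothetical helper, not part of souporcell, and it assumes one barcode per line in a plain or gzipped file.

```python
import gzip


def count_barcodes(path):
    """Count non-empty lines in a barcodes file (plain text or .gz)."""
    # Assumption: the file has one barcode per line and may be gzipped.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.strip())
```

If the count is in the millions, as with the alt.mtx headers posted in this thread, the file is probably not the list of cell-called barcodes, whatever folder it came from.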