Open sanchezy opened 2 years ago
I'll do this as soon as I can. I need a computer with admin access (my work computer doesn't have that), and I'm trying to find the charger for my 2011 MacBook Air, lol. I just moved and it wasn't in the same box as the computer... In the meantime, you could run vartrix manually, add its output files to that folder along with a vartrix.done file, and then restart the pipeline. It will see the vartrix.done file and resume from the next step. Just use the same arguments as in the error message above.
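For anyone hitting this, the workaround above can be sketched as follows. The vartrix flags mirror the ones the pipeline reports in its error messages later in this thread; the output directory and input paths are placeholders to replace with your own.

```shell
# Placeholder output dir; use the directory you passed to -o.
OUT=souporcell_out_demo
mkdir -p "$OUT"

# Re-run vartrix by hand if it is on PATH, with the same arguments the
# pipeline printed in its error message (paths below are placeholders).
command -v vartrix >/dev/null && vartrix --mapq 30 \
  -b "$OUT/souporcell_minimap_tagged_sorted.bam" \
  -c barcodes.tsv \
  --scoring-method coverage \
  --ref-matrix "$OUT/ref.mtx" \
  --out-matrix "$OUT/alt.mtx" \
  -v "$OUT/souporcell_merged_sorted_vcf.vcf.gz" \
  --fasta genome.fa --umi --threads 8 || true

# Once ref.mtx and alt.mtx exist, drop the sentinel file so the pipeline
# resumes from the step after vartrix on restart.
touch "$OUT/vartrix.done"
```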
I also keep getting a vartrix crash. However, I cannot find the crash report.
Well, this is embarrassing.
vartrix had a problem and crashed. To help us diagnose the problem you can send us a crash report.
We have generated a report file at "/tmp/report-cb2042fa-804e-491a-bc56-91f750318372.toml". Submit an issue or email with the subject of "vartrix Crash Report" and include the report as an attachment.
We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.
Thank you kindly!
There is no tmp/ directory in the working directory, so I am not sure where the report was saved. Thanks!
Is there a vartrix.err file?
Yes, that is all the vartrix.err file contains.
Well, this is embarrassing.
vartrix had a problem and crashed. To help us diagnose the problem you can send us a crash report.
We have generated a report file at "/tmp/report-cb2042fa-804e-491a-bc56-91f750318372.toml". Submit an issue or email with the subject of "vartrix Crash Report" and include the report as an attachment.
Authors: Ian Fiddes ian.fiddes@10xgenomics.com, Patrick Marks patrick@10xgenomics.com
We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.
Thank you kindly!
I cannot locate the report anywhere, at least not in the directories I have permissions for.
It might be something upstream of vartrix, and we are giving vartrix bad input. What does the vcf look like? Can you try running vartrix manually?
I think it was my vcf file. It had been aligned to hg19 by the sequencing centre, not hg38 like my bam file. It is running fine now that I lifted it over to hg38. Although now I am having a problem with the clustering, I will open another issue for that. Thanks!
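For others who land here with the same reference mismatch: one way to do that hg19-to-hg38 liftover is sketched below. This just writes the command to a helper script; CrossMap, the UCSC chain file name, and the vcf/fasta filenames are assumptions to substitute with your own (Picard LiftoverVcf is an alternative).

```shell
# Write a liftover helper; run it where CrossMap, the chain file, and the
# target reference fasta are available.
cat > liftover_hg38.sh <<'EOF'
#!/bin/sh
# CrossMap usage: CrossMap.py vcf <chain> <in.vcf> <target.fa> <out.vcf>
CrossMap.py vcf hg19ToHg38.over.chain.gz variants_hg19.vcf GRCh38.fa variants_hg38.vcf
EOF
chmod +x liftover_hg38.sh
```

After running it, the lifted vcf should use the same contig names as the bam before being passed to souporcell.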
I have the same problem here. I have 4 libraries, analyzed with the same CellRanger version and the same reference genome. The analysis completed for 3 out of 4 samples, while one of them crashed with the same error message.
Can you provide the contents of any of the .err files?
vartrix.err file:
vartrix had a problem and crashed. To help us diagnose the problem you can send us a crash report.
We have generated a report file at "/tmp/1555006.1.bigmem.q/report-be2ad8f9-943e-415c-87b3-02b37277f039.toml". Submit an issue or email with the subject of "vartrix Crash Report" and include the report as an attachment.
We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.
Thank you kindly!
However, the temporary directory does not seem to exist.
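The report path is absolute (under /tmp on the compute node, not a tmp/ subdirectory of the working directory), and the `1555006.1.bigmem.q` component looks like a scheduler's per-job scratch directory, which is typically deleted when the job ends. That would explain why the report cannot be found. A sketch for checking whether any reports survived (paths are assumptions):

```shell
# Look for vartrix crash reports in the node-local /tmp and in the
# scheduler's per-job TMPDIR, if one is still set.
ls /tmp/report-*.toml 2>/dev/null || true
find "${TMPDIR:-/tmp}" -maxdepth 2 -name 'report-*.toml' 2>/dev/null || true
```

If the job's scratch directory is gone, re-running vartrix interactively (outside the scheduler) is the surest way to regenerate the report.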
- This is the content of the `retag.err` file:
[bam_sort_core] merging from 1 files and 1 in-memory blocks...
[bam_sort_core] merging from 1 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
- This is the content of the `bcftools.err` file:
Writing to /tmp/bcftools-sort.d0HlEo
Merging 1 temporary files
Cleaning
Done
@LorenzoMerotto and @wheaton5, did you work this out? We are running into a similar issue where most pools have executed correctly but a couple haven't. They have been processed the same way upstream, so the reason for failure is not clear. Any input you have would be fantastic!
Thanks for your help!
I think I need more information. Usually when vartrix fails, it's due to a previous error, probably freebayes failing. Can you check all the .err files and also whether the vcf output from freebayes is empty?
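The empty-vcf check can be sketched like this, using a tiny synthetic VCF so the example is self-contained; in practice, point it at the freebayes output in the souporcell output directory (and `cat` the *.err files there too):

```shell
# Build a minimal stand-in VCF (one header block, one variant record).
VCF=demo.vcf
printf '##fileformat=VCFv4.2\n' > "$VCF"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$VCF"
printf 'chr1\t100\t.\tA\tT\t50\tPASS\t.\n' >> "$VCF"

# Count non-header records; 0 would mean an empty call set.
n=$(grep -vc '^#' "$VCF")
echo "records: $n"   # prints "records: 1" for this demo file
```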
Thanks for the fast response @wheaton5 !
I can't see anything in particular that jumps out as a problem with any of the preceding steps, but I've put the details below, so hopefully you'll see something that we've missed.
Here's a summary of the files generated in the failed pool:
-rw-r--r-- 1 27794 Oct 27 13:47 fastqs.done
-rw-r--r-- 1 4698 Oct 27 17:25 minimap.err
-rw-r--r-- 1 2182 Oct 27 17:25 remapping.done
-rw-r--r-- 1 1023 Oct 27 17:51 retag.err
-rw-r--r-- 1 43123150821 Oct 27 20:52 souporcell_minimap_tagged_sorted.bam
-rw-r--r-- 1 6012584 Oct 27 21:08 souporcell_minimap_tagged_sorted.bam.bai
-rw-r--r-- 1 0 Oct 27 21:09 retagging.done
-rw-r--r-- 1 62832057 Oct 27 21:30 depth_merged.bed
-rw-r--r-- 1 367251435 Oct 27 21:31 common_variants_covered_tmp.vcf
-rw-r--r-- 1 367258040 Oct 27 21:31 common_variants_covered.vcf
-rw-r--r-- 1 135 Oct 27 21:31 variants.done
-rw-r--r-- 1 0 Oct 27 21:31 vartrix.out
-rw-r--r-- 1 605 Oct 27 22:37 vartrix.err
Here are the contents of each of the error files.

minimap.err:
[M::mm_idx_gen::60.303*1.79] collected minimizers
[M::mm_idx_gen::68.434*2.98] sorted minimizers
[M::main::68.434*2.98] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::68.434*2.98] mid_occ = 1000
[M::mm_idx_stat] kmer size: 21; skip: 11; is_hpc: 0; #seq: 194
[M::mm_idx_stat::77.292*2.75] distinct minimizers: 381286575 (95.43% are singletons); average occurrences: 1.291; average spacing: 6.295
[M::worker_pipeline::84.520*3.71] mapped 263158 sequences
[M::worker_pipeline::90.321*4.51] mapped 263158 sequences
[M::worker_pipeline::94.441*5.02] mapped 263158 sequences
[M::worker_pipeline::100.680*5.71] mapped 263158 sequences
[M::worker_pipeline::108.147*6.41] mapped 263158 sequences
[M::worker_pipeline::114.296*6.94] mapped 263158 sequences
[M::worker_pipeline::121.092*7.45] mapped 263158 sequences
[M::worker_pipeline::128.174*7.93] mapped 263158 sequences
[M::worker_pipeline::136.020*8.41] mapped 263158 sequences
[M::worker_pipeline::141.022*8.69] mapped 263158 sequences
[M::worker_pipeline::148.326*9.05] mapped 263158 sequences
[M::worker_pipeline::154.288*9.32] mapped 263158 sequences
[M::worker_pipeline::161.255*9.62] mapped 263158 sequences
[M::worker_pipeline::169.079*9.92] mapped 263158 sequences
[M::worker_pipeline::173.876*10.10] mapped 263158 sequences
[M::worker_pipeline::177.756*10.23] mapped 263158 sequences
[M::worker_pipeline::189.109*10.57] mapped 263158 sequences
[M::worker_pipeline::192.414*10.67] mapped 263158 sequences
[M::worker_pipeline::196.717*10.79] mapped 263158 sequences
[M::worker_pipeline::200.775*10.90] mapped 263158 sequences
[M::worker_pipeline::208.185*11.08] mapped 263158 sequences
[M::worker_pipeline::215.253*11.25] mapped 263158 sequences
[M::worker_pipeline::220.922*11.38] mapped 263158 sequences
[M::worker_pipeline::227.564*11.52] mapped 263158 sequences
[M::worker_pipeline::232.541*11.62] mapped 263158 sequences
[M::worker_pipeline::238.969*11.74] mapped 263158 sequences
[M::worker_pipeline::246.172*11.87] mapped 263158 sequences
[M::worker_pipeline::251.254*11.96] mapped 263158 sequences
[M::worker_pipeline::255.360*12.03] mapped 263158 sequences
[M::worker_pipeline::261.302*12.12] mapped 263158 sequences
[M::worker_pipeline::265.206*12.18] mapped 263158 sequences
[M::worker_pipeline::271.390*12.27] mapped 263158 sequences
[M::worker_pipeline::278.991*12.38] mapped 263158 sequences
[M::worker_pipeline::284.412*12.45] mapped 263158 sequences
[M::worker_pipeline::290.930*12.54] mapped 263158 sequences
[M::worker_pipeline::296.213*12.60] mapped 263158 sequences
[M::worker_pipeline::303.196*12.68] mapped 263158 sequences
[M::worker_pipeline::308.148*12.74] mapped 263158 sequences
[M::worker_pipeline::314.887*12.81] mapped 263158 sequences
[M::worker_pipeline::321.757*12.88] mapped 263158 sequences
[M::worker_pipeline::326.717*12.93] mapped 263158 sequences
[M::worker_pipeline::331.553*12.98] mapped 263158 sequences
[M::worker_pipeline::337.556*13.04] mapped 263158 sequences
[M::worker_pipeline::344.316*13.10] mapped 263158 sequences
[M::worker_pipeline::349.462*13.15] mapped 263158 sequences
[M::worker_pipeline::354.906*13.19] mapped 263158 sequences
[M::worker_pipeline::368.196*13.30] mapped 263158 sequences
[M::worker_pipeline::373.878*13.34] mapped 263158 sequences
[M::worker_pipeline::377.816*13.37] mapped 263158 sequences
[M::worker_pipeline::384.011*13.42] mapped 263158 sequences
[M::worker_pipeline::390.742*13.47] mapped 263158 sequences
[M::worker_pipeline::395.201*13.50] mapped 263158 sequences
[M::worker_pipeline::400.524*13.54] mapped 263158 sequences
[M::worker_pipeline::406.876*13.58] mapped 263158 sequences
[M::worker_pipeline::411.816*13.62] mapped 263158 sequences
[M::worker_pipeline::415.891*13.64] mapped 263158 sequences
[M::worker_pipeline::424.468*13.69] mapped 263158 sequences
[M::worker_pipeline::428.557*13.68] mapped 110770 sequences
[M::main] Version: 2.7-r654
[M::main] CMD: minimap2 -ax splice -t 16 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no genome.fa tmp.fq
[M::main] Real time: 429.353 sec; CPU: 5863.578 sec
mapping
minimap2 -ax splice -t 16 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no genome.fa tmp.fq
retag.err:
[bam_sort_core] merging from 9 files and 1 in-memory blocks...
[bam_sort_core] merging from 16 files and 1 in-memory blocks...
[bam_sort_core] merging from 17 files and 1 in-memory blocks...
[bam_sort_core] merging from 20 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 21 files and 1 in-memory blocks...
[bam_sort_core] merging from 23 files and 1 in-memory blocks...
[bam_sort_core] merging from 23 files and 1 in-memory blocks...
[bam_sort_core] merging from 24 files and 1 in-memory blocks...
[bam_sort_core] merging from 33 files and 1 in-memory blocks...
vartrix.err:
Well, this is embarrassing.
vartrix had a problem and crashed. To help us diagnose the problem you can send us a crash report.
We have generated a report file at "/tmp/report-4980e9d6-bfc5-407e-9cec-3ba62c19145b.toml". Submit an issue or email with the subject of "vartrix Crash Report" and include the report as an attachment.
We take privacy seriously, and do not perform any automated error collection. In order to improve the software, we rely on people to submit reports.
Thank you kindly!
The freebayes vcf looks normal to me and has 1,181,356 variants. Here's the top and bottom of the file:
chr1 788439 1:788439:T:A T A . PASS AF=0.07287;MAF=0.07287;R2=0.47297;IMPUTED;AC=2;AN=22 GT:DS:GP 0|0:0.038:0.962,0.037,0 0|0:0.182:0.823,0.171,0.005 1|0:0.893:0.107,0.893,0 0|0:0.059:0.942,0.057,0.001 0|0:0.072:0.93,0.069,0.001 0|0:0.173:0.834,0.158,0.007 0|1:0.68:0.383,0.553,0.064 0|0:0.055:0.945,0.054,0.001 0|0:0.058:0.943,0.057,0.001 0|0:0.059:0.942,0.057,0.001 0|0:0.058:0.943,0.056,0.001
chr1 791101 1:791101:T:G T G . PASS AF=0.83234;MAF=0.16766;R2=0.41176;IMPUTED;AC=19;AN=22 GT:DS:GP 1|1:1.676:0.026,0.272,0.702 1|1:1.483:0.035,0.448,0.517 1|0:0.964:0.036,0.964,0 1|1:1.841:0.006,0.146,0.848 1|1:1.83:0.007,0.156,0.837 1|0:1.227:0.098,0.578,0.325 1|0:1.139:0.126,0.609,0.265 1|1:1.852:0.005,0.137,0.857 1|1:1.843:0.006,0.145,0.849 1|1:1.85:0.006,0.139,0.855 1|1:1.844:0.006,0.144,0.85
...
chr9 138122079 9:138122079:C:T C T . PASS AF=0.80279;MAF=0.19721;R2=0.86081;IMPUTED;AC=13;AN=22 GT:DS:GP 0|1:0.978:0.022,0.978,0 1|1:1.92:0,0.079,0.921 0|1:1.021:0.001,0.977,0.022 0|1:0.979:0.022,0.977,0.001 0|1:0.969:0.031,0.969,0 0|1:1.018:0.002,0.979,0.019 0|1:0.986:0.029,0.956,0.015 0|1:0.975:0.025,0.975,0 1|1:1.862:0.004,0.129,0.866 0|1:0.994:0.011,0.985,0.005 0|1:0.988:0.012,0.988,0
chr9 138123517 9:138123517:C:T C T . PASS AF=0.36297;MAF=0.36297;R2=0.85195;IMPUTED;AC=7;AN=22 GT:DS:GP 0|1:0.985:0.015,0.985,0 0|0:0.25:0.751,0.248,0.001 0|1:0.983:0.017,0.982,0 0|0:0.005:0.995,0.005,0 0|1:0.98:0.02,0.98,0 0|1:0.997:0.01,0.982,0.007 0|1:0.976:0.024,0.976,0 0|0:0.003:0.997,0.003,0 0|1:1.155:0.044,0.758,0.198 0|1:0.981:0.019,0.98,0 0|0:0.003:0.997,0.003,0
Let me know if you see something that we're missing or if there are additional details we can provide to help identify the issue.
What is the deal with the multisample vcf? Freebayes is run in a mode for unknown mixed samples and outputs a single-sample vcf, I thought.
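A quick way to check how many sample columns a vcf carries, sketched over a tiny synthetic vcf so it runs standalone (with bcftools installed, `bcftools query -l your.vcf | wc -l` does the same on the real file):

```shell
# Minimal two-sample VCF for demonstration.
printf '##fileformat=VCFv4.2\n' > demo_multi.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> demo_multi.vcf
printf 'chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT\t0|0\t0|1\n' >> demo_multi.vcf

# Sample count = fields after the 9 fixed columns on the #CHROM header line.
awk -F'\t' '/^#CHROM/ {print NF - 9}' demo_multi.vcf   # prints 2
```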
Are you using known_genotypes? Can you post your command-line arguments?
We are not using `known_genotypes`, but we are using `common_variants`, and the vcf we're using has the variants for the individuals in the pool. This is typically how we run souporcell, so I don't think that is likely to be causing the error. Here's the command being run:
souporcell_pipeline.py \
-i $BAM \
-b $BARCODES \
-f $FASTA \
-t $THREADS \
-o $SOUPORCELL_OUTDIR \
-k $N \
--common_variants $VCF
You could try running vartrix manually with the latest vartrix? I made a new singularity build recently to include hisat2, which gives better alignments for variant calling, and I could update vartrix as well if that fixes things.
We're trying that now - will let you know how it goes.
I hadn't seen that you had made a new singularity build. We'll take a look and see if the updated version helps sort things out
It's not up yet. I'm testing it now.
@drneavin I solved it by running the analysis through conda. I created a new env and installed the required dependencies.
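For others taking the conda route, a sketch of the env setup. The package list here is an assumption pieced together from the tools the pipeline invokes (minimap2, samtools, bcftools, freebayes, vartrix) plus common Python deps; check the souporcell README for the authoritative list. The sketch writes the command to a script rather than running it, since solving the env takes a while:

```shell
cat > make_souporcell_env.sh <<'EOF'
#!/bin/sh
# Create a dedicated env with the pipeline's external tools and Python deps.
# Package list is an assumption; see the souporcell README for the real one.
conda create -y -n souporcell -c bioconda -c conda-forge \
  python=3.8 minimap2 samtools bcftools freebayes vartrix \
  pysam numpy scipy
EOF
chmod +x make_souporcell_env.sh
```

Then `conda activate souporcell` before running souporcell_pipeline.py.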
Great, thanks both! I can confirm that the issue was resolved with the newest version of vartrix. @wheaton5, might be good to update it in the new image as well.
Thanks, I will update it in the new singularity build.
I am also having this issue. I also posted on the Demuxafy board, as I am using the Demuxafy singularity image to run souporcell. Have the images been updated since this discussion, or should I also run vartrix separately? Thanks!
Hi @drneavin, may I ask how the assignment of clusters to individuals in the pool is usually done in your case following this command? I'm a bit confused by the VCF files given to `known_genotypes` and `common_variants`. On some of my pooled samples, my initial attempts including `known_genotypes` and `known_genotypes_sample_names` couldn't complete and stalled at the clustering. I'd like to give the command option you recommended a try if it works well. Thank you so much!
> We are not using `known_genotypes`, but we are using `common_variants`, and the vcf we're using has the variants for the individuals in the pool. This is typically how we run souporcell, so I don't think that is likely to be causing the error. Here's the command being run: `souporcell_pipeline.py -i $BAM -b $BARCODES -f $FASTA -t $THREADS -o $SOUPORCELL_OUTDIR -k $N --common_variants $VCF`
Hi @Angel-Wei, I have put together some wrappers for demultiplexing and doublet-detecting methods in Demuxafy. The script I think you're looking for will correlate the genotypes in the vcf output by souporcell with your own vcf after running souporcell; you can find it here. Or if you just want to run the script without downloading the Demuxafy singularity image, you can find that script here.
If you have any followup questions about Demuxafy or this script, it would probably be best to open an issue here.
Hi @drneavin! Thank you so much for the quick response! Yes, I was also looking at Demuxafy, and the documentation was really clear to follow. I guess I misunderstood and thought there was another pipeline besides Demuxafy that I wasn't aware of. I can surely proceed with that. Thank you so much!
Hi @drneavin! Sorry to bug you again, but if you don't mind, can I ask one more question? Is there supposed to be any difference between using `common_variants` or not when running the pipeline in a genotype-free manner (i.e., not using `known_genotypes` and `known_genotypes_sample_names`)? My attempt with the recommended command hasn't completed yet, but I assume including `common_variants` will output a common_variants_covered.vcf file compared to not including it? Thank you so much!
Hi @wheaton5
I ran the souporcell_latest.sif pipeline (using singularity) successfully for 14 of my 16 libraries. In two of them I got an error that I tracked back to vartrix (in vartrix.err). The error is this:
Traceback (most recent call last):
  File "/opt/souporcell/souporcell_pipeline.py", line 589, in <module>
    vartrix(args, final_vcf, bam)
  File "/opt/souporcell/souporcell_pipeline.py", line 512, in vartrix
    subprocess.check_call(cmd, stdout = out, stderr = err)
  File "/usr/local/envs/py36/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['vartrix', '--mapq', '30', '-b', '/home/yaraScratch/souporcell-F1678CM-AB2-Sc-4-1/souporcell_minimap_tagged_sorted.bam', '-c', '/home/yara/Scratch/souporcell-F1678CM-AB2-Sc-4-1/barcodes.tsv', '--scoring-method', 'coverage', '--threads', '8', '--ref-matrix', '/home/yara/Scratch/souporcell-F1678CM-AB2-Sc-4-1/ref.mtx', '--out-matrix', '/home/yara/Scratch/souporcell-F1678CM-AB2-Sc-4-1/alt.mtx', '-v', '/home/yara/Scratch/souporcell-F1678CM-AB2-Sc-4-1/souporcell_merged_sorted_vcf.vcf.gz', '--fasta', '/home/yara/Scratch/references/refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa', '--umi']' returned non-zero exit status 101.
I emailed the crash reports to the authors, and they replied that I should try a newer version of vartrix (https://github.com/10XGenomics/vartrix/releases/tag/v1.1.22). So my questions are: is there a way around this? How could I do this? Would it be possible for you to add this to souporcell_latest.sif?
Many thanks for your help!