Closed jjfarrell closed 2 years ago
can you run duphold on just that sample and see what happens?
with this many samples, it might be better to do the parallelization yourself rather than use smoove duphold
which relies on a single machine (maybe that's what you're doing, in part, with xargs, but I can't tell).
I am not sure how to determine which cram/sample corresponds to the bcf file, dh009936222smoove-duphold.bcf, since it gets erased on exit. I am running smoove duphold on one compute node with 196 GB and 32 cores for each chr. It runs surprisingly fast. I am guessing it is running out of memory on the merge. If the files were not deleted, I could play with getting the merge going. Is there a debug option to not delete the temp bcf files?
I did find a snapshot of the temp bcf directory that contained the bcf that failed to load the csi. The bcf looked fine and was indexed. I was able to merge it with another bcf. So I think this is likely a bcftools merge issue.
yeah. bcftools merge has problems with many SV VCFs. I've seen that in smoove square as well. I think it has trouble when a single huge (likely spurious) event spans the entire chromosome; I guess it keeps that event in memory, along with all the variants it contains.
The issue is the number of open files allowed on linux. The default for our cluster is 1024:
```shell
$ ulimit -n
1024
```
This can be increased to 2048, but on our system we cannot raise it any higher:
```shell
$ ulimit -n 30000
-bash: ulimit: open files: cannot modify limit: Operation not permitted
```
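The distinction that matters here is between the soft limit, which any user can raise themselves, and the hard limit, which only root can raise (e.g. via /etc/security/limits.conf on Linux). A minimal check of both:

```shell
#!/bin/sh
# Soft limit: the current cap on open files; any user may raise it,
# but only up to the hard limit.
soft=$(ulimit -Sn)
# Hard limit: the ceiling for the soft limit; raising it requires root.
hard=$(ulimit -Hn)
echo "soft=$soft hard=$hard"
# Raising the soft limit only succeeds when the target is <= the hard
# limit, which is why `ulimit -n 30000` above fails without root.
```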
A workaround for this on Biostar uses the following approach:
```shell
# split the list of VCFs into batches below the open-file limit
ls *.vcf.gz | split -l 500 - subset_vcfs
# merge and index each batch
for i in subset_vcfs*; do
    bcftools merge -0 -l $i -Oz -o merge.$i.vcf.gz
    tabix -p vcf merge.$i.vcf.gz
done
# then merge the (much smaller number of) batch merges
ls merge.*.vcf.gz > merge.txt
bcftools merge -0 -l merge.txt -Oz -o all_merged.vcf.gz
```
The list of files to merge is split into subsets smaller than the ulimit -n size, and those intermediate merges are then merged themselves. Is this approach something that could be implemented in smoove duphold for merging?
I have put in a request to our IT support to increase the ulimit soft limit maximum. Hopefully that will solve the issue at our site.
If the hierarchical merge above is not feasible, it would be great if the number of files passed to smoove duphold could be checked against ulimit -n on startup. If the number N of sample files is too large, stop the job with an error message telling the user to increase ulimit -n to at least N files. Right now the tool will run duphold on 20,000+ samples successfully and then terminate at the merge step.
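While waiting on an upstream fix, a check like that can also live in a wrapper script. A minimal sketch (the file name cram.list and the headroom value are illustrative, not anything smoove does today):

```shell
#!/bin/sh
# Hypothetical pre-flight check (not smoove's actual behavior): refuse to
# start when the number of input files, plus some headroom for smoove's
# own descriptors, exceeds the open-file limit.
seq 200 | sed 's/^/sample/' > cram.list   # demo stand-in for the real cram list

nfiles=$(wc -l < cram.list)
limit=$(ulimit -n)
headroom=100   # extra descriptors needed during the merge itself

if [ $((nfiles + headroom)) -gt "$limit" ]; then
    echo "error: $nfiles input files but 'ulimit -n' is $limit;" \
         "raise it to at least $((nfiles + headroom))" >&2
    exit 1
fi
echo "ok: $nfiles files fit under ulimit -n $limit"
```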
I am surprised that you haven't hit this limit before! The best way would be to run duphold for each sample, then use svtools merge to merge the VCFs.
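A rough dry-run sketch of that per-sample layout, emitting one duphold command line per cram so the jobs can be submitted to the cluster independently. The flag names follow duphold's README and should be checked against `duphold --help`; `sites.vcf.gz` and `ref.fa` are placeholders, and the final svtools merge step is left out:

```shell
#!/bin/sh
# Dry run: generate one duphold command per cram instead of letting
# smoove duphold fan out and merge on a single machine.
# Flags are from duphold's README -- verify with `duphold --help`.
printf 'a.cram\nb.cram\n' > cram_demo.list   # stand-in for the 22,847-cram list

while read -r cram; do
    sample=$(basename "$cram" .cram)
    echo duphold --threads 4 --vcf sites.vcf.gz --bam "$cram" \
         --fasta ref.fa --output "$sample-duphold.bcf"
done < cram_demo.list > duphold_jobs.txt
# duphold_jobs.txt now holds one command line per sample; the resulting
# per-sample BCFs would then be combined with a merge tool afterwards.
```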
Actually, I found the max was set to 20K for batch-queue jobs, which is why I hit the error this time. It was set to the lower number on the login node and when sshing into a node. So the higher ulimit max is now working fine with smoove. An earlier check and a better error message would still be helpful, though.
thanks for following up. I plan to revisit this merging step at some point.
When running the following command on 22,847 crams, I am getting a "could not load index" error from bcftools. See output below.
```shell
cat cram_22847.list | xargs -s 2080845 smoove duphold --processes $NSLOTS \
    --fasta /restricted/projectnb/casa/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa \
    --vcf pVCF.ont/ont.graphtyper.chr$N.vcf.gz \
    --outvcf pVCF.ont/ont.graphtyper.chr$N.duphold.vcf.gz
```
This is the beginning of the output:

```
[smoove] 2022/01/09 00:08:14 starting with version 0.2.6
[smoove] 2022/01/09 00:08:14 running duphold on 22847 files in 32 processes
[smoove] 2022/01/09 00:09:24 [duphold] finished
[smoove] 2022/01/09 00:09:25 [duphold] finished
```
This is the error message at the end:

```
$ tail smoove_duphold_ont.sh.o1742754
[smoove] 2022/01/09 05:09:00 [duphold] finished
[smoove] 2022/01/09 05:09:01 [duphold] finished
[smoove] 2022/01/09 05:09:02 [duphold] finished
[smoove] 2022/01/09 05:09:02 [duphold] finished
[smoove] 2022/01/09 05:09:03 [duphold] finished
[smoove] 2022/01/09 05:09:04 [duphold] finished
[smoove] 2022/01/09 05:09:04 starting bcftools merge
[smoove] 2022/01/09 11:37:48 [E::hts_idx_load3] Could not load local index file 'tmp.22/tempclean-606560922/dh009936222smoove-duphold.bcf.csi'
Failed to open tmp.22/tempclean-606560922/dh009936222smoove-duphold.bcf: could not load index
2022/01/09 11:38:53 exit status 255
```
The temp files are cleaned up on exit, so I am not able to look at the bcf files to see what may be happening.
Any suggestions for this?