hall-lab / svtyper

Bayesian genotyper for structural variants
MIT License

SVtyper questions / issues #99

Open calhoujd opened 5 years ago

calhoujd commented 5 years ago

I've spent the last few days getting familiar with Lumpy + SVtyper and am starting to get the hang of them. I've run into an issue that I haven't been able to solve on my own, and I also have a question about whether a certain behavior is normal.

Issue: I wrote a script to parallelize SVtyper runs by chromosome, because I've been having trouble getting jobs to complete on full WGS trios. I ran it for the first time overnight, and about half of the chromosomes ran fine and finished within 1-4 hours. However, some are still running (currently 16 hours and counting). I can't work out what is causing this. For example, chr 8 is still running, while chr 9 finished quite quickly. Any thoughts would be greatly appreciated.
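One check that might help narrow this down is to compare how many candidate SV records each per-chromosome LUMPY VCF contains. This is a minimal sketch that assumes the `trio.chrN.vcf` naming used in this post. It also assumes, without confirmation, that genotyping time grows with the number of candidate calls. `count_sv_records` is a hypothetical helper name, not part of LUMPY or SVtyper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count non-header records in a VCF
# (every line that does not start with '#').
count_sv_records() {
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    grep -vc '^#' "$1" || true
}

# Print one "file<TAB>count" line per LUMPY output VCF in the current directory.
for vcf in trio.chr*.vcf; do
    [ -e "$vcf" ] || continue                      # no matching files
    case "$vcf" in *.gt.vcf) continue ;; esac      # skip SVtyper outputs
    printf '%s\t%s\n' "$vcf" "$(count_sv_records "$vcf")"
done
```

If the chromosomes that are still running have far more records than the ones that finished, the candidate count is a plausible suspect. If the counts are similar, the cause is more likely elsewhere.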

Question: I've noticed that when I run SVtyper, it doesn't write lines to the output .gt.vcf in real time. The output file is created, but sits empty for the entire run. Then, just as the run completes, the file gets filled all at once. Is this normal? Am I doing something wrong?

Here is the code I've been using:

module load python
module load java
module load samtools

LUMPYexpress -x btu356_LCR-hs37d5.bed -B proband.chr1.bam,mother.chr1.bam,father.chr1.bam -S proband.chr1.splitters.bam,mother.chr1.splitters.bam,father.chr1.splitters.bam -D proband.chr1.discordants.bam,mother.chr1.discordants.bam,father.chr1.discordants.bam -o trio.chr1.vcf

svtyper -B proband.chr1.bam,mother.chr1.bam,father.chr1.bam -S proband.chr1.splitters.bam,mother.chr1.splitters.bam,father.chr1.splitters.bam -i trio.chr1.vcf > trio.chr1.gt.vcf
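For reference, the per-chromosome wrapper could look roughly like the sketch below. It is not the actual script, only an illustration under the file-naming assumptions used in the commands above. `join_inputs` and `genotype_chrom` are hypothetical names. The sketch builds the comma-separated lists without spaces, since a space after a comma would split the list into separate shell arguments. It also logs start and end times to stderr so slow chromosomes are easy to spot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a comma-separated list (no spaces) of the three
# per-sample files for one chromosome, e.g.
#   proband.chr1.bam,mother.chr1.bam,father.chr1.bam
join_inputs() {
    local chrom=$1 suffix=$2 out="" s
    for s in proband mother father; do
        out="${out:+$out,}${s}.${chrom}.${suffix}"
    done
    printf '%s\n' "$out"
}

# Hypothetical wrapper: genotype one chromosome, logging wall-clock
# start and end times to stderr.
genotype_chrom() {
    local chrom=$1
    echo "[$(date)] start ${chrom}" >&2
    svtyper \
        -B "$(join_inputs "$chrom" bam)" \
        -S "$(join_inputs "$chrom" splitters.bam)" \
        -i "trio.${chrom}.vcf" \
        > "trio.${chrom}.gt.vcf"
    echo "[$(date)] done ${chrom}" >&2
}

# Usage (uncomment to run; starts one background job per chromosome):
# for c in chr1 chr8 chr9; do genotype_chrom "$c" & done; wait
```

Running each chromosome as a background job and calling `wait` keeps the runs independent, so one slow chromosome does not hold up the others.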

PS - I'm currently using LUMPY v0.2.13 & svtyper v ??? (I'm not sure which version I have or how to check. I looked in the README.md and didn't see it listed.)