arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

Lumpyexpress produces no output and no errors for one sample #329

Closed: calizilla closed this issue 4 years ago

calizilla commented 4 years ago

Hi,

I have 9 sheep samples that I would like to call jointly with lumpyexpress. The job dies at the 7th sample in the list. I then ran each of the 9 samples individually (code below). 8 of them worked fine, but the sample that caused the failure in joint calling runs without any error message yet produces no output.

Software versions:

lumpy-sv/0.3.0 samtools/1.10 sambamba/0.6.4 samblaster/0.1.24

Code for joint calling:

for (( i = 0; i < ${#samples[@]}; i++ ))
do
    samp=${samples[$i]}
    lumpy_bamlist+=",${bam_in}/${samp}/${samp}.coordSorted.dedup.bam"
    disclist+=",${sd_in}/${samp}.disc.sort.bam"
    splitlist+=",${sd_in}/${samp}.split.sort.bam"
done

lumpy_bamlist=$(echo $lumpy_bamlist | sed -r 's/^.{1}//')
splitlist=$(echo $splitlist | sed -r 's/^.{1}//')
disclist=$(echo $disclist | sed -r 's/^.{1}//')
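The leading comma, and with it the `sed` clean-up, can be avoided by adding the separator only when the list is already non-empty. The following is a minimal sketch, not the poster's actual script: `build_lists` is a hypothetical helper, and the directories and sample names are placeholders.

```sh
#!/bin/sh
# build_lists: hypothetical helper. Fills lumpy_bamlist, disclist and
# splitlist with comma-separated paths for the sample names given as
# arguments. "${var:+${var},}" expands to "<var>," only if var is non-empty,
# so no leading comma is ever produced.
build_lists() {
    lumpy_bamlist="" disclist="" splitlist=""
    for samp in "$@"; do
        lumpy_bamlist="${lumpy_bamlist:+${lumpy_bamlist},}${bam_in}/${samp}/${samp}.coordSorted.dedup.bam"
        disclist="${disclist:+${disclist},}${sd_in}/${samp}.disc.sort.bam"
        splitlist="${splitlist:+${splitlist},}${sd_in}/${samp}.split.sort.bam"
    done
}

# Placeholder values for illustration only.
bam_in=/data/bam
sd_in=/data/sd
build_lists sampleA sampleB sampleC

printf '%s\n' "$disclist"
```

With a bash array as in the loop above, the call would be `build_lists "${samples[@]}"`.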

temp dir

mkdir ${out}/Lumpy_temp

Call SVs with lumpy

lumpyexpress \
    -B $lumpy_bamlist \
    -S $splitlist \
    -D $disclist \
    -R $ref \
    -o ${out}/${breed}.lumpySVs_nogeno.vcf \
    -T ${out}/Lumpy_temp

Genotype lumpy SV calls with SVtyper

svtyper \
    -B $lumpy_bamlist \
    -i ${out}/${breed}.lumpySVs_nogeno.vcf \
    -o ${out}/${breed}.lumpySVtyper.vcf \
    -T $ref \
    -w ${out}/${breed}.lumpySVtyper.bam

Error log for above job (VCF path and filename obscured for confidentiality reasons):

Removed 95 outliers with isize >= 856
Removed 96 outliers with isize >= 889
Removed 136 outliers with isize >= 812
Removed 134 outliers with isize >= 1187
Removed 164 outliers with isize >= 1169
Removed 77 outliers with isize >= 1191
Removed 116 outliers with isize >= 852
usage: svtyper [-h] [-i FILE] [-o FILE] -B FILE [-T FILE] [-l FILE] [-m INT] [-n INT] [-q] [--max_reads INT] [--max_ci_dist INT] [--split_weight FLOAT] [--disc_weight FLOAT] [-w FILE] [--verbose]
svtyper: error: argument -i/--input_vcf: can't open '.vcf': [Errno 2] No such file or directory: '.vcf'
Unknown parameter ".vcf". Run -h for help.
 at /usr/local/vcftools/0.1.14/bin/vcf-sort line 20.
    main::error("Unknown parameter \".l"...) called at /usr/local/vcftools/0.1.14/bin/vcf-sort line 46
    main::parse_params() called at /usr/local/vcftools/0.1.14/bin/vcf-sort line 10
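The log suggests that svtyper was still started even though lumpyexpress had not written a VCF, which buries the first failure under follow-on errors. One way to stop at the first failure is to check that the expected file exists and is non-empty before the next step. This is a sketch only: `require_nonempty` is a hypothetical helper, not part of lumpy or svtyper.

```sh
#!/bin/sh
# require_nonempty: hypothetical helper. Returns non-zero and prints a
# message to stderr when FILE is missing or has zero size ([ -s ] tests both).
require_nonempty() {
    if [ ! -s "$1" ]; then
        printf 'ERROR: expected output is missing or empty: %s\n' "$1" >&2
        return 1
    fi
}

# Intended use in the pipeline above (commented out, paths as in the post):
#   vcf="${out}/${breed}.lumpySVs_nogeno.vcf"
#   lumpyexpress -B "$lumpy_bamlist" -S "$splitlist" -D "$disclist" \
#       -R "$ref" -o "$vcf" -T "${out}/Lumpy_temp" || exit 1
#   require_nonempty "$vcf" || exit 1
#   svtyper -B "$lumpy_bamlist" -i "$vcf" -o "${out}/${breed}.lumpySVtyper.vcf" ...
```

With such a guard the job would stop right after lumpyexpress instead of reaching svtyper and vcf-sort.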

Code and standard output for the 7th sample that won't run:

$ lumpyexpress -B $bam -S $split -D $disc -R $ref -o ${out}/${sample}.lumpySVs_nogeno.vcf -T $temp -kv
Sourcing executables from /usr/local/lumpy-sv/0.3.0/bin/lumpyexpress.config ...

Checking for required python modules (/usr/local/python/2.7.10/bin/python)...

create temporary directory

Calculating insert distributions...
Library read groups: HFW2HBBXX.4_AUMEPM000000000017_1,HCKKTBBXX.3_AUMEPM000000000017_1,HCKKTBBXX.8_AUMEPM000000000017_1,HFW2HBBXX.3_AUMEPM000000000017_1,HFW2HBBXX.8_AUMEPM000000000017_1,HCKKTBBXX.2_AUMEPM000000000017_1,HCJVLBBXX.4_AUMEPM000000000017_1,HFW2HBBXX.2_AUMEPM000000000017_1,HFW2HBBXX.7_AUMEPM000000000017_1,HCKJMBBXX.3_AUMEPM000000000017_1,HCKJMBBXX.8_AUMEPM000000000017_1,HCKJMBBXX.2_AUMEPM000000000017_1,HCKJMBBXX.7_AUMEPM000000000017_1,HFW2HBBXX.6_AUMEPM000000000017_1,HFW2HBBXX.1_AUMEPM000000000017_1,HCKV5BBXX.8_AUMEPM000000000017_1,HCKV5BBXX.3_AUMEPM000000000017_1,HCKJMBBXX.6_AUMEPM000000000017_1,HCKV5BBXX.2_AUMEPM000000000017_1,HCKJMBBXX.1_AUMEPM000000000017_1,HCKV5BBXX.7_AUMEPM000000000017_1,HFW2HBBXX.5_AUMEPM000000000017_1,HCFL2BBXX.5_AUMEPM000000000017_1,HCKV5BBXX.6_AUMEPM000000000017_1,HCKV5BBXX.1_AUMEPM000000000017_1,HCKJMBBXX.5_AUMEPM000000000017_1,HCFL2BBXX.4_AUMEPM000000000017_1,HCLN7BBXX.1_AUMEPM000000000017_1,HCKJMBBXX.4_AUMEPM000000000017_1,HCKV5BBXX.5_AUMEPM000000000017_1,HCKKTBBXX.7_AUMEPM000000000017_1,HCKV5BBXX.4_AUMEPM000000000017_1,HCHGKBBXX.8_AUMEPM000000000017_1,HCKKTBBXX.1_AUMEPM000000000017_1,HCKKTBBXX.6_AUMEPM000000000017_1,HCJVLBBXX.8_AUMEPM000000000017_1,HCJVLBBXX.3_AUMEPM000000000017_1,HCHGKBBXX.7_AUMEPM000000000017_1,HCKKTBBXX.5_AUMEPM000000000017_1,HCJVLBBXX.7_AUMEPM000000000017_1,HCJVLBBXX.2_AUMEPM000000000017_1,HCKKTBBXX.4_AUMEPM000000000017_1,HCH2TBBXX.4_AUMEPM000000000017_1,HCJVLBBXX.6_AUMEPM000000000017_1,HCJVLBBXX.1_AUMEPM000000000017_1,HCHFCBBXX.5_AUMEPM000000000017_1
Library read length: 151
Removed 116 outliers with isize >= 852
done

For the other 8 successful samples, the last 'done' message was followed by this:

0
Running LUMPY...
LUMPY Express done

and the VCF output file was created. For the failing sample, no output files are created. Further, the '-k' (keep temporary files) flag does not keep the temporary files. While the above command is running, the temp dir contains one file (suffix vcf.sample1.lib1.insert.stats), which stays empty for the whole run time of ~45 seconds.
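Since the run ends silently, one low-effort check is to record the exit status and keep all output in a log file, so that a non-zero status is visible even when nothing is printed. This is a sketch under that assumption only: `run_and_report` is a hypothetical wrapper, not a lumpy feature.

```sh
#!/bin/sh
# run_and_report: hypothetical wrapper. Runs the given command with stdout
# and stderr redirected to LOG, prints the exit status to stderr, and
# returns that same status to the caller.
run_and_report() {
    log=$1
    shift
    rc=0
    "$@" >"$log" 2>&1 || rc=$?
    printf 'exit status %d, output saved to %s\n' "$rc" "$log" >&2
    return "$rc"
}

# Intended use with the command from the post (commented out):
#   run_and_report "${out}/${sample}.lumpyexpress.log" \
#       lumpyexpress -B "$bam" -S "$split" -D "$disc" -R "$ref" \
#       -o "${out}/${sample}.lumpySVs_nogeno.vcf" -T "$temp" -kv
```

A non-zero status here would indicate that some step failed quietly rather than finishing without results.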

There is nothing wrong with the BAM (albeit low coverage). I have also called SVs with Manta and small variants with GATK4 on this sample.

Do you have any suggestions for debugging? I might be able to share the BAM with you (permission would need to be sought first).

Many thanks, Cali

ryanlayer commented 4 years ago

Hi, we are no longer supporting lumpyexpress; please use smoove (https://github.com/brentp/smoove) instead. I really need to update the lumpy docs to reflect this switch.

If you have questions about getting this going please reach out to me directly.
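For orientation, a smoove run on a small cohort might look roughly like the sketch below. It only prints the command line (a dry run). The flags follow the usage shown in the smoove README and should be checked against `smoove call -h`; `smoove_cmd`, the cohort name, and all paths are hypothetical placeholders.

```sh
#!/bin/sh
# smoove_cmd: hypothetical helper that prints, but does not execute, a
# "smoove call" command line for a cohort name, a reference FASTA, an
# output directory, and one or more BAM files.
smoove_cmd() {
    name=$1
    fasta=$2
    outdir=$3
    shift 3
    printf 'smoove call --outdir %s --name %s --fasta %s -p 1 --genotype' \
        "$outdir" "$name" "$fasta"
    for bam in "$@"; do
        printf ' %s' "$bam"
    done
    printf '\n'
}

# Placeholder arguments for illustration only.
smoove_cmd sheep_cohort ref.fa results-smoove sampleA.bam sampleB.bam
```

Printing the command first makes it easy to review the BAM list before starting a real run.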

On Wed, Mar 18, 2020 at 12:56 AM calliza notifications@github.com wrote:
