EichlerLab / smrtsv2

Structural variant caller
MIT License
53 stars 6 forks source link

Fail in call_variants #53

Closed zihhuafang closed 4 years ago

zihhuafang commented 4 years ago

Hi,

I am running the call phase of SMRTSV2 using Pacbio reads from cattle, and the pipeline failed with the following error message.

Traceback (most recent call last):

  File "/cluster/work/pausch/fang/smrtsv2/scripts/call/PrintSNVSupport.py", line 40, in <module>
    v = prevVals[0:5] + [str(len(support))] + [";".join(support)]
TypeError: 'NoneType' object has no attribute '__getitem__'
Traceback (most recent call last):
  File "/cluster/work/pausch/fang/smrtsv2/scripts/call/PrintSNVSupport.py", line 40, in <module>
    v = prevVals[0:5] + [str(len(support))] + [";".join(support)]
TypeError: 'NoneType' object has no attribute '__getitem__'
[Fri Feb 28 20:07:15 2020]
Error in rule call_remove_indel_events_in_homopolymers:
jobid: 0
output: call/indel_calls/ins/gaps_without_homopolymers.bed

[Fri Feb 28 20:07:15 2020]
Error in rule call_remove_indel_events_in_homopolymers:
jobid: 0
output: call/indel_calls/del/gaps_without_homopolymers.bed

RuleException:
CalledProcessError in line 256 of /cluster/work/pausch/fang/smrtsv2/rules/call.snakefile:
Command ' set -euo pipefail;  awk '$11 == "F"' call/indel_calls/del/gaps.bed | cut -f 1,2,3,5,6,8 | sort -k 1,1 -k 2,2n -k 4,4n | python2 -s /cluster/work/pausch/fang/smrtsv2/scripts/call/PrintSNVSupport.py /dev/stdin /dev/stdout > call/indel_calls/del/gaps_without_homopolymers.bed ' returned non-zero exit status 1.
  File "/cluster/work/pausch/fang/smrtsv2/rules/call.snakefile", line 256, in __rule_call_remove_indel_events_in_homopolymers
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/concurrent/futures/thread.py", line 55, in run
RuleException:
CalledProcessError in line 256 of /cluster/work/pausch/fang/smrtsv2/rules/call.snakefile:
Command ' set -euo pipefail;  awk '$11 == "F"' call/indel_calls/ins/gaps.bed | cut -f 1,2,3,5,6,8 | sort -k 1,1 -k 2,2n -k 4,4n | python2 -s /cluster/work/pausch/fang/smrtsv2/scripts/call/PrintSNVSupport.py /dev/stdin /dev/stdout | python2 -s /cluster/work/pausch/fang/smrtsv2/scripts/call/BedMod.py /dev/stdin call/indel_calls/ins/gaps_without_homopolymers.bed --leftjustify 1 ' returned non-zero exit status 1.
  File "/cluster/work/pausch/fang/smrtsv2/rules/call.snakefile", line 256, in __rule_call_remove_indel_events_in_homopolymers
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Exiting because a job execution failed. Look above for error message.

Traceback (most recent call last):
Traceback (most recent call last):
  File "/cluster/work/pausch/fang/smrtsv2/scripts/call/cluster_calls.py", line 76, in <module>
  File "/cluster/work/pausch/fang/smrtsv2/scripts/call/cluster_calls.py", line 76, in <module>
    columns_per_call = len(list(calls[0]))
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python2/lib/python2.7/site-packages/pybedtools/bedtool.py", line 1164, in __getitem__
    columns_per_call = len(list(calls[0]))
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python2/lib/python2.7/site-packages/pybedtools/bedtool.py", line 1164, in __getitem__
    return list(islice(self, key, key + 1))[0]
IndexError    : return list(islice(self, key, key + 1))[0]
list index out of range
IndexError: list index out of range
[Fri Feb 28 20:10:55 2020]
Error in rule call_identify_calls_by_type:
    jobid: 39
    output: call/sv_calls/calls.del.bed

RuleException:
CalledProcessError in line 500 of /cluster/work/pausch/fang/smrtsv2/rules/call.snakefile:
Command ' set -euo pipefail;  awk 'tolower($4) == "del" && index($6, "N") == 0' call/sv_calls/gaps.bed | awk 'OFS="\t" { if ("del" == "ins") { $3=$2 + 1 } print }' | python2 -s /cluster/work/pausch/fang/smrtsv2/scripts/call/cluster_calls.py --window 20 --reciprocal_overlap 0.5 /dev/stdin intersect | awk 'OFS="\t" { if ("del" == "ins") { $3=$2 + $5 } print }' | sort -k 1,1 -k 2,2n | while read line; do set -- $line; coverage=`samtools view -c assemble/local_assemblies.bam $1:$2-$3`; echo -e "$line\t$coverage"; done > call/sv_calls/calls.del.bed ' returned non-zero exit status 1.
  File "/cluster/work/pausch/fang/smrtsv2/rules/call.snakefile", line 500, in __rule_call_identify_calls_by_type
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job call_identify_calls_by_type since they might be corrupted:
call/sv_calls/calls.del.bed

I suspected that the files (tiling_contigs.tab, assembled_contigs.depth.bed and gaps.bed ) were not correctly generated at the beginning the pipeline because they were empty, and subsequently all the downstream files were also empty. I am not sure what the problem was. In the /call directory I see the following

total 1.5K
---------- 1 fangzi    0 Feb 28 20:07 tiling_contigs.tab
d--------- 4 fangzi 4.0K Feb 28 20:07 indel_calls
---------- 1 fangzi    0 Feb 28 20:10 assembled_contigs.depth.bed
---------- 1 fangzi 1.3K Feb 28 20:10 inversions.vcf
d--------- 2 fangzi 4.0K Feb 28 20:14 sv_calls

I ran with the following command:

${SMRTSV_DIR}/smrtsv --wait-time 60 --tempdir ${TMP} \
        run \
        --runjobs  "25,20,80,10" \
        --species B.taurus \
        --threads 64 \
        --exclude /cluster/work/pausch/fang/UCD_genome/contig.bed \
        ${REF_FA} \
        ${FOFN}

Could you suggest how to fix it? Thank you in advance.

paudano commented 4 years ago

First, make sure that assemble/local_assemblies.bam has contigs in it. If it is empty, then the assemblies failed, and variant calling is failing because of that. You can look through some of the assemble/group/GROUP_ID/contig_group_bm.log log files (where "GROP_ID" is a range of grouped assemblies) to see why it is failing.

If the assemblies are there, remove all of the call directory and re-run. If it still fails, see if you can find the output for rule call_calculate_coverage_from_assembled_contigs (may be in log directory if SMRT-SV was distributed over a cluster).