bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

parsedAlternates: alignment does not start with match over padded sequence #3206

Closed chatchawit closed 4 years ago

chatchawit commented 4 years ago

Version info

YAML file

details:

Error Message

subprocess.CalledProcessError: Command 'set -o pipefail; gunzip -c /home/bcbio/peat/work/gatk-haplotype/Glaucoma.vcf.gz | bcftools view -f 'PASS,.' --min-ac 1:nref | vcfallelicprimitives -t DECOMPOSED --keep-geno | sed 's/ID=AD,Number=./ID=AD,Number=R/' | vt decompose -s - | vt normalize -n -r /home/bcbio/install/stable/genomes/Hsapiens/hg38/seq/hg38.fa - | awk '{ gsub("./-65", "./."); print $0 }' | sed -e 's/Number=A/Number=1/g' | bgzip -c > /home/bcbio/peat/tmp/tmpbtjsuwij/Glaucoma-noeff-decompose.vcf.gz

parsedAlternates: alignment does not start with match over padded sequence 15M4I9M1S ZZZZZZZZZZQCCCZZZZZZZZZZ ZZZZZZZZZZQNON_REF>ZZZZZZZZZZ

I found that "vcfallelicprimitives" causes the error.

naumenko-sa commented 4 years ago

Not sure if this helps immediately, but you are using jointcaller + ensemble together. jointcaller is for population (joint) calling; ensemble is for combining calls from multiple callers (e.g. gatk, vardict). jointcaller + ensemble should not be used together.

variantcaller: gatk-haplotype
jointcaller: gatk-haplotype-joint
ensemble:
    numpass: 3

SN
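
To illustrate the advice above, here is a minimal sketch of the two setups as separate configurations. The original YAML file was not included in this thread, so the enclosing `algorithm:` key, the second caller (`vardict`, mentioned above only as an example) and the `numpass: 2` value are assumptions for illustration, not the reporter's actual settings.

```yaml
# Option 1 (illustrative): joint/population calling with one caller, no ensemble
algorithm:
  variantcaller: gatk-haplotype
  jointcaller: gatk-haplotype-joint
---
# Option 2 (illustrative): ensemble across several callers, no jointcaller
# The caller list and numpass value are assumed; pick them to suit the project.
algorithm:
  variantcaller: [gatk-haplotype, vardict]
  ensemble:
    numpass: 2
```

In the second form, `numpass` should presumably not exceed the number of callers listed, which may be why `numpass: 3` with a single caller looks suspicious in the snippet above.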

naumenko-sa commented 4 years ago

Closing for now. Feel free to re-open if you still see the issue.