luntergroup / octopus

Bayesian haplotype-based mutation calling
MIT License

Octopus outputs invalid alternate allele #100

DBS-Max closed 4 years ago

DBS-Max commented 4 years ago

Describe the bug

I was trying to impute a 1x sample using Beagle 5.1 and received the following error on the VCF file produced by octopus:

ERROR: invalid ALT allele at chr1:9144584 [*AAA]

Version

$ octopus --version
octopus version 0.6.3-beta (release/0.6.3-beta 8b0e2968)
Target: x86_64 Linux 5.0.0-1023-azure
Compiler: GNU 8.3.0
Boost: 1_71

and

octopus version 0.7.0 (develop 0aa9e574)
Target: x86_64 Linux 5.0.0-1023-azure
SIMD extension: AVX2
Compiler: GNU 9.2.0
Boost: 1_71

DBS-Max commented 4 years ago

Ah, read the documentation, gotta turn the --legacy option on. Feel free to close!

DBS-Max commented 4 years ago

The develop version doesn't recognize the --legacy option; 0.6.3b does.

DBS-Max commented 4 years ago

Command (octopus 0.6.3b):

octopus \
      -R Homo_sapiens_assembly38.fasta \
      -I 1_NA12878.bam \
      -t octopus_intervals \
      -o 1_NA12878.bcf \
      --sequence-error-model PCR-FREE.NOVASEQ \
      --threads 16 \
      --legacy

Octopus still outputs alternate alleles containing a *, which causes Beagle to throw an error.

dancooke commented 4 years ago

I'm removing the --legacy option in v0.7.0. I'll probably include a script to replicate the functionality.

DBS-Max commented 4 years ago

Okay,

How should I go about it in the meantime? Just get rid of all the * characters?

Cheers,

Max.

dancooke commented 4 years ago

> How should I go about it in the meantime? Just get rid of all the * characters?

You will need to remove any ALT alleles containing a * character, and then set any GT fields pointing to these alleles to 0 (i.e. REF). You might also need to recompute any annotation values that refer to ALT alleles (e.g. AC).
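
For illustration, the clean-up described above could look roughly like the sketch below. It is not part of octopus: the function name `strip_star_alts` is hypothetical, and it assumes an uncompressed, tab-separated VCF data line. It drops ALT alleles containing `*`, points GT indices for the dropped alleles at 0 (REF), and renumbers the remaining ALT indices. When no ALT allele is left it writes `.` in the ALT column, which is an assumption rather than something stated in this thread. It does not recompute INFO annotations such as AC, nor any other per-allele INFO or FORMAT values.

```python
import re


def strip_star_alts(line):
    """Drop ALT alleles containing '*' from one VCF data line and fix GT.

    Hypothetical helper: per-allele INFO/FORMAT values (e.g. AC) are NOT
    recomputed here and would need separate handling.
    """
    fields = line.rstrip("\n").split("\t")
    alts = fields[4].split(",")

    # Map old allele index -> new allele index. REF (0) stays 0,
    # removed ALT alleles are pointed at 0, kept ones are renumbered.
    remap = {0: 0}
    kept = []
    for old_index, alt in enumerate(alts, start=1):
        if "*" in alt:
            remap[old_index] = 0
        else:
            kept.append(alt)
            remap[old_index] = len(kept)

    if len(kept) == len(alts):
        return "\t".join(fields)  # no '*' alleles, nothing to change

    # Assumption: use '.' when every ALT allele was removed.
    fields[4] = ",".join(kept) if kept else "."

    # Rewrite the GT entry of every sample column, if GT is present.
    if len(fields) > 9:
        keys = fields[8].split(":")
        if "GT" in keys:
            gt_pos = keys.index("GT")
            for i in range(9, len(fields)):
                values = fields[i].split(":")
                if gt_pos < len(values):
                    # Only digits are replaced, so '/', '|' and '.' survive.
                    values[gt_pos] = re.sub(
                        r"\d+",
                        lambda m: str(remap.get(int(m.group()), int(m.group()))),
                        values[gt_pos],
                    )
                fields[i] = ":".join(values)

    return "\t".join(fields)
```

Header lines (those starting with `#`) should be passed through unchanged rather than given to this function. The record in the usage below is made up apart from the position and the `*AAA` allele from the Beagle error message:

```python
record = "chr1\t9144584\t.\tA\t*AAA,AT\t50\tPASS\t.\tGT:DP\t1|2:10"
print(strip_star_alts(record))
```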

DBS-Max commented 4 years ago

Okay, I just deleted all lines with a *, easy enough for now.
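
A minimal sketch of that simpler approach, assuming an uncompressed, tab-separated VCF read line by line. The function name `drop_star_records` is hypothetical. It checks only the ALT column, so header lines and other columns that happen to contain a `*` are left alone. Unlike removing individual alleles, this discards the whole record, including any other ALT alleles at the same site.

```python
def drop_star_records(lines):
    """Yield VCF lines, skipping data records whose ALT column contains '*'."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines pass through untouched
            continue
        alt = line.split("\t")[4]  # ALT is the fifth VCF column
        if "*" not in alt:
            yield line
```

It can be applied to an open file object or to any iterable of lines, for example `list(drop_star_records(open("calls.vcf")))`, where `calls.vcf` is a placeholder file name.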

dancooke commented 4 years ago

Closing as not a bug.