niemasd / ViReflow

An elastically-scaling automated AWS pipeline for viral consensus sequence generation
GNU General Public License v3.0

Possible bug when handling positions with multiple alternate alleles #3

Closed niemasd closed 3 years ago

niemasd commented 3 years ago

When a position in the VCF has multiple alternate alleles, I don't think my naive bash/awk command is guaranteed to produce the correct output. It would probably be better to write a simple Python script that works on any arbitrary VCF.

niemasd commented 3 years ago

demo1.variants.vcf.txt

niemasd commented 3 years ago

Rather than trying to use awk and grep to do the filtering, I decided to write a Python script to do it:

https://github.com/Niema-Docker/bcftools/blob/main/alt_vars.py
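For readers who want to see the general idea, here is a minimal sketch of per-allele handling in Python. It is not the contents of `alt_vars.py`. The function names `split_alt_alleles` and `filter_vcf_lines`, the use of the `AF` INFO key, and the `min_af=0.5` threshold are all assumptions made for illustration. The only facts it relies on are from the VCF format itself: the ALT column is a comma-separated list, and per-allele INFO values such as `AF` are comma-separated in the same order.

```python
def split_alt_alleles(vcf_line):
    """Yield (chrom, pos, ref, alt, af) once per alternate allele in a VCF data line.

    Hypothetical helper: assumes allele frequency is stored in the INFO key "AF"
    with one comma-separated value per ALT allele.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, alts, info = fields[0], int(fields[1]), fields[3], fields[4], fields[7]
    # INFO is a ';'-separated list; flag entries have no '=' and are skipped here
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    af_values = info_map.get("AF", "").split(",")
    for i, alt in enumerate(alts.split(",")):
        has_af = i < len(af_values) and af_values[i] not in ("", ".")
        yield chrom, pos, ref, alt, (float(af_values[i]) if has_af else None)


def filter_vcf_lines(lines, min_af=0.5):
    """Return the alternate alleles whose AF is at least min_af (assumed criterion)."""
    kept = []
    for line in lines:
        # Skip header lines and blank lines
        if line.startswith("#") or not line.strip():
            continue
        for chrom, pos, ref, alt, af in split_alt_alleles(line):
            if af is not None and af >= min_af:
                kept.append((chrom, pos, ref, alt, af))
    return kept


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "ref\t100\t.\tA\tC,T\t50\tPASS\tDP=100;AF=0.2,0.7",
]
print(filter_vcf_lines(example))  # [('ref', 100, 'A', 'T', 0.7)]
```

A column-based awk filter would see `C,T` and `0.2,0.7` as single values on that last line, which is why each allele is evaluated separately here.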

niemasd commented 3 years ago

Fix incorporated in: https://github.com/niemasd/ViReflow/commit/766248859e369f3c7c38a7676a83298af67de8c3