samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools mpileup resulting in downstream unparsable vcf record with allele M #2256

Open anscholes opened 3 months ago

anscholes commented 3 months ago

Hello,

I am currently trying to call SNPs to generate an alternative FASTA sequence. Whenever I combine BAM files using bcftools mpileup (bcftools version 1.20), everything runs properly until I run GATK IndexFeatureFile, and then I am left with the error "The provided VCF file is malformed at approximately line number 25863560: unparsable vcf record with allele M".

When I check the combined VCF file I do see the M allele (see below):

AM270990.1 518406 . M MA 999 PASS INDEL;IDV=1;IMF=0.0714286;DP=77;VDB=3.06326e-29;SGB=-3.22514;RPBZ=-2.95767;MQBZ=0.986013;MQSBZ=0.930949;SCBZ=0;MQ0F=0; AF1=1;AC1=5;DP4=1,1,65,8;MQ=42;FQ=-243.559;PV4=0.227027,1,1,1 GT:PL 1:255,42,0 1:243,48,0 1:255,29,0 1:246,60,0 1:200,24,0

When I check the single (non-combined) VCF files, I see neither this M allele nor position 518406 on chr AM270990.1.

Code:

bcftools mpileup -Ou -f Aniger_Reference_Files/GCA_000002855.2_ASM285v2_genomic.fna Sorted_Mapped_Trimmed_CTWT-1A.bam Sorted_Mapped_Trimmed_CTWT-1B.bam > Sorted_Mapped_Trimmed_CTWT_AB.bcf

pd3 commented 3 months ago

This looks like an IUPAC ambiguity code. I suspect it comes from your reference file; bcftools prints only what it encounters. Can you check the output of

samtools faidx Aniger_Reference_Files/GCA_000002855.2_ASM285v2_genomic.fna AM270990.1:518406-518406
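
If that position does turn out to hold an M (the IUPAC code for "A or C"), it may help to see how many other ambiguity codes the reference contains. The following is a minimal sketch, not part of samtools or bcftools: `find_non_acgtn` is a hypothetical helper name, and it assumes a plain (uncompressed) FASTA on standard input. It prints the sequence name, the 1-based position, and the base for every character other than A, C, G, T or N.

```shell
# Hypothetical helper (not a samtools/bcftools command): list every
# non-ACGTN base in an uncompressed FASTA stream read from stdin.
# Output columns (tab-separated): sequence name, 1-based position, base.
find_non_acgtn() {
  awk '
    /^>/ {
      # Header line: keep the name up to the first whitespace, restart counting.
      name = substr($1, 2); pos = 0; next
    }
    {
      sub(/\r$/, "")                      # tolerate Windows line endings
      n = length($0)
      if ($0 !~ /^[ACGTNacgtn]*$/) {      # only scan lines that contain something else
        line = toupper($0)
        for (i = 1; i <= n; i++) {
          base = substr(line, i, 1)
          if (base !~ /^[ACGTN]$/) printf "%s\t%d\t%s\n", name, pos + i, base
        }
      }
      pos += n
    }
  '
}

# Usage, with the reference path from the mpileup command in the original post:
# find_non_acgtn < Aniger_Reference_Files/GCA_000002855.2_ASM285v2_genomic.fna | head
```

Positions are counted per sequence, so a hit can be compared directly with the CHROM and POS columns of the VCF record in question.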