samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

MNV consequence annotation in local mode #1810

Closed JakeHagen closed 1 year ago

JakeHagen commented 1 year ago

Hello, does bcftools csq support multi-nucleotide variants (MNVs)?

This MNV 1-201363379-CC-AA produces the CSQ below when run in local mode

missense|TNNT2|ENST00000660295|protein_coding|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000236918|protein_coding|-|172EE>172D*|201363379CC>AA,missense|TNNT2|ENST00000360372|protein_coding|-|132EE>132D*|201363379CC>AA,missense|TNNT2|ENST00000367317|protein_coding|-|157EE>157D*|201363379CC>AA,missense|TNNT2|ENST00000367318|protein_coding|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000422165|protein_coding|-|172EE>172D*|201363379CC>AA,missense|TNNT2|ENST00000656932|protein_coding|-|172EE>172D*|201363379CC>AA,missense|TNNT2|ENST00000367322|protein_coding|-|161EE>161D*|201363379CC>AA,missense|TNNT2|ENST00000666449|NMD|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000658476|protein_coding|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000412633|protein_coding|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000367320|protein_coding|-|132EE>132D*|201363379CC>AA,missense|TNNT2|ENST00000509001|protein_coding|-|162EE>162D*|201363379CC>AA,missense|TNNT2|ENST00000438742|protein_coding|-|156EE>156D*|201363379CC>AA,missense|TNNT2|ENST00000455702|protein_coding|-|167EE>167D*|201363379CC>AA,3_prime_utr|TNNT2|ENST00000663843|NMD

The amino acid change and the DNA change look correct, but it would be nice if the consequence field reported the more impactful of the two consequences (stop gained rather than missense), or perhaps combined both with &.
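To spot such records programmatically, the CSQ string above can be split into its fields. The following is a minimal Python sketch, not part of bcftools: the field order (consequence, gene, transcript, biotype, strand, amino acid change, DNA change) is read off the output above, and the names CsqEntry, parse_csq, introduces_stop and mislabelled are hypothetical helpers introduced only for illustration.

```python
from typing import List, NamedTuple


class CsqEntry(NamedTuple):
    """One comma-separated item of a CSQ value (field order assumed from the output above)."""
    consequence: str
    gene: str
    transcript: str
    biotype: str
    strand: str = ""      # the trailing fields are absent for some consequences, e.g. 3_prime_utr
    aa_change: str = ""
    dna_change: str = ""


def parse_csq(value: str) -> List[CsqEntry]:
    """Split a CSQ value into one CsqEntry per transcript."""
    return [CsqEntry(*item.split("|")[:7]) for item in value.split(",")]


def introduces_stop(entry: CsqEntry) -> bool:
    """True if the alternate amino acid string contains a stop (*) that the reference lacks."""
    ref, sep, alt = entry.aa_change.partition(">")
    return bool(sep) and "*" in alt and "*" not in ref


def mislabelled(entries: List[CsqEntry]) -> List[CsqEntry]:
    """Entries that introduce a stop codon but are not annotated as stop_gained."""
    return [
        e for e in entries
        if introduces_stop(e) and "stop_gained" not in e.consequence.split("&")
    ]


# First and last items of the CSQ value reported in this issue.
csq = (
    "missense|TNNT2|ENST00000660295|protein_coding|-|162EE>162D*|201363379CC>AA,"
    "3_prime_utr|TNNT2|ENST00000663843|NMD"
)
entries = parse_csq(csq)
flagged = [e.transcript for e in mislabelled(entries)]
print(flagged)
```

On the two items used here, only the missense entry is flagged, since its amino acid change 162EE>162D* ends in a stop.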

This was with bcftools 1.16, and the command was csq -l -f ./genome.hg38rg.fa -g ./Homo_sapiens.GRCh38.107.gff3.gz ./test.vcf

Thanks, Jake

pd3 commented 1 year ago

If the MNV is in a single VCF record, there should be no difference between the local mode and the haplotype-aware mode. The latter affects only variants present in separate VCF records.

This is indeed a bug: both modes should have reported stop_gained. This is now fixed, thank you for the bug report.
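Why stop_gained is the expected call can be checked by hand. The Python sketch below is an illustration only and does not reflect how bcftools computes consequences. It assumes the two affected glutamate codons read GAG GAG on the transcript (minus) strand; the actual reference sequence is not shown in this thread, and GAG GAA would give the same result. The codon table is deliberately partial, and the helper names are hypothetical.

```python
# Partial codon table: only the codons needed for this example.
CODON_TABLE = {
    "GAG": "E", "GAA": "E",              # glutamate
    "GAT": "D", "GAC": "D",              # aspartate
    "TAG": "*", "TAA": "*", "TGA": "*",  # stop codons
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse-complement a plus-strand sequence to get the minus-strand reading."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def apply_mnv(cds: str, offset: int, ref: str, alt: str) -> str:
    """Replace ref with alt at a 0-based offset, checking that ref matches first."""
    if cds[offset:offset + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return cds[:offset] + alt + cds[offset + len(ref):]


ref_cds = "GAGGAG"        # ASSUMPTION: the two E codons on the transcript strand
ref_tx = revcomp("CC")    # plus-strand REF CC read on the minus strand: GG
alt_tx = revcomp("AA")    # plus-strand ALT AA read on the minus strand: TT

# The MNV spans the last base of the first codon and the first base of the second.
alt_cds = apply_mnv(ref_cds, 2, ref_tx, alt_tx)

print(translate(ref_cds), ">", translate(alt_cds))
```

Under that assumption the first codon becomes GAT (aspartate) and the second becomes TAG (stop), which matches the EE>D* change in the reported annotation and explains why stop_gained is the more severe consequence to report.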

JakeHagen commented 1 year ago

Wow that was quick. Thanks a lot