Bioconductor / VariantAnnotation

Annotation of Genetic Variants
https://bioconductor.org/packages/VariantAnnotation

REFCODON not consistent with REF #43

Open onebeingmay opened 4 years ago

onebeingmay commented 4 years ago

Hello VariantAnnotation team, I am using the predictCoding function to annotate the coding consequence of SNPs in my VCF file using a custom GTF file. Here is what I ran:

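library(VariantAnnotation)  # readVcf, predictCoding; also attaches Rsamtools (FaFile)
library(GenomicFeatures)    # makeTxDbFromGFF
txdb = makeTxDbFromGFF(file="my_genepred.gtf", format="gtf")
fa = open(FaFile("GRCh37.primary_assembly.genome.fa"))
vcf_file = readVcf(file = "my_vcf.vcf")
effects = predictCoding(vcf_file, txdb, fa)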

In the resulting "effects" object, most of the annotation was correct, but some REFCODON or VARCODON values were not consistent with REF or ALT. For example, here is the SNP annotation for one gene (I show only the "GENEID", "POS", "REFCODON", "VARCODON", "REF", and "ALT" columns).

GENEID     POS REFCODON VARCODON REF ALT
ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG 9727738      TGA      TCA   A   G
ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG 9728777      AAA      AAA   C   T
ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG 9728786      AGT      ACT   T   G

You can see that the reported consequence of the A>G mutation is the codon change TGA>TCA, which doesn't make sense. I went back to my VCF, and at chr19:9727738 there is indeed an A>G mutation. So it seems that there is a bug in the codon annotation.
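To make the inconsistency concrete, here is a minimal base-R sketch of the check, using the three rows from the table above. It assumes that REFCODON and VARCODON are reported on the transcript strand, so for this minus-strand transcript REF and ALT are complemented before being compared with the codons. The helper names `complement` and `codon_matches` are hypothetical and not part of VariantAnnotation.

```r
# Complement a single DNA base (base R only; no Bioconductor needed).
complement <- function(base) chartr("ACGT", "TGCA", base)

# Rows copied from the table above.
calls <- data.frame(
  POS      = c(9727738, 9728777, 9728786),
  REFCODON = c("TGA", "AAA", "AGT"),
  VARCODON = c("TCA", "AAA", "ACT"),
  REF      = c("A", "C", "T"),
  ALT      = c("G", "T", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when exactly one codon position carries the
# strand-adjusted REF in REFCODON and the strand-adjusted ALT in VARCODON,
# with the other two positions unchanged.
codon_matches <- function(refcodon, varcodon, ref, alt, strand = "-") {
  if (strand == "-") {
    # Assumption: codons are given on the transcript strand.
    ref <- complement(ref)
    alt <- complement(alt)
  }
  r <- strsplit(refcodon, "")[[1]]
  v <- strsplit(varcodon, "")[[1]]
  any(vapply(seq_along(r), function(i) {
    r[i] == ref && v[i] == alt && all(r[-i] == v[-i])
  }, logical(1)))
}

calls$consistent <- mapply(codon_matches,
                           calls$REFCODON, calls$VARCODON,
                           calls$REF, calls$ALT,
                           USE.NAMES = FALSE)
print(calls)
```

Under that assumption, none of the three rows passes the check, and they also fail with `strand = "+"`. In rows 1 and 3 the substituted base in VARCODON is the complement of ALT, but the base it replaces in REFCODON is not the complement of REF.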

I am also attaching the GTF annotation of this gene in case there is a problem with my GTF file.

chr19 master_genepred.txt transcript 9720432 9731906 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG";
chr19 master_genepred.txt exon 9720432 9722012 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "1"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.1";
chr19 master_genepred.txt CDS 9721984 9722012 . - 2 gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "1"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.1";
chr19 master_genepred.txt exon 9727721 9727847 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "2"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.2";
chr19 master_genepred.txt CDS 9727721 9727847 . - 0 gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "2"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.2";
chr19 master_genepred.txt exon 9728767 9728855 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "3"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.3";
chr19 master_genepred.txt CDS 9728767 9728799 . - 0 gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "3"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.3";
chr19 master_genepred.txt exon 9730108 9730258 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "4"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.4";
chr19 master_genepred.txt exon 9731838 9731906 . - . gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "5"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.5";
chr19 master_genepred.txt start_codon 9728797 9728799 . - 0 gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "3"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.3";
chr19 master_genepred.txt stop_codon 9721981 9721983 . - 0 gene_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; transcript_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG"; exon_number "1"; exon_id "ENST00000326044.9_2:chr19:-|22|2017:277:469|truncation|ATG.1";

Thank you!