Bioconductor / VariantAnnotation

Annotation of Genetic Variants
https://bioconductor.org/packages/VariantAnnotation

Incorrect annotation of exon/intron boundary deletion #83

Open mschubert opened 4 months ago


Using:

I've got the following variant:

chr1:3826659-3826710 (genomic)
"CGATTCTTTACACACCCCAGTTCTTTGTGCACCCCAATTCTTTACATACCCT" (ref) -> "C" (alt)
                                                ( ^ exon 16 end <--- )

This overlaps transcript ENST00000378230, which has the following exons:

seqnames          ranges strand |         exon_id exon_rank
   <Rle>       <IRanges>  <Rle> |     <character> <integer>
    chr1 3829266-3829373      - | ENSE00001690217        15
    chr1 3826708-3826744      - | ENSE00001730754        16
    chr1 3826370-3826436      - | ENSE00001769704        17

so the deletion removes only the last 3 bases of exon 16 (genomic 3826708-3826710; the transcript is on the minus strand) and otherwise lies in the intron that follows it. The predictCoding function reports:

REFCODON: CAGGACATTCAAGGAGGGAAAGCAGCCCCTGCTGAAGCTCTGGGAATCCCGGAT (= QDIQGGKAAPAEALGIPD)
VARCODON: CGT (= R; GT from extending into the intron?)
CDSLOC: 2186-2188
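
For comparison, the expected overlap can be worked out with plain interval arithmetic. The sketch below is Python, not R, and does not call VariantAnnotation. It assumes the usual VCF convention that the leading base shared by ref and alt (the "C") is an anchor that is kept, so the deleted bases start one position after it. The helper names `deleted_span` and `overlap_width` are made up for illustration.

```python
# Exons of ENST00000378230 from the table above
# (1-based inclusive genomic coordinates, minus strand).
EXONS = {
    15: (3829266, 3829373),
    16: (3826708, 3826744),
    17: (3826370, 3826436),
}

REF_START = 3826659
REF = "CGATTCTTTACACACCCCAGTTCTTTGTGCACCCCAATTCTTTACATACCCT"
ALT = "C"


def deleted_span(pos, ref, alt):
    """Return the 1-based inclusive genomic span removed by an anchored deletion.

    Assumption: alt is a prefix of ref (the anchor) and is not deleted.
    """
    if not ref.startswith(alt):
        raise ValueError("expected alt to be a prefix of ref")
    return pos + len(alt), pos + len(ref) - 1


def overlap_width(a, b):
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


deletion = deleted_span(REF_START, REF, ALT)
deleted_total = deletion[1] - deletion[0] + 1
per_exon = {rank: overlap_width(deletion, span) for rank, span in EXONS.items()}
intronic = deleted_total - sum(per_exon.values())

print("deleted span:", deletion)
print("deleted bases:", deleted_total)
print("exonic bases per exon rank:", per_exon)
print("intronic bases:", intronic)
```

Under that assumption only exon 16 is touched, and only by 3 bases. The remaining deleted bases fall in the intron between exons 16 and 17.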

The sequence of each exon listed above is:

15: GCACGGAGAAAAGCGGCTACAGAAGAAGCAGAAAAACAAAAGAAAGAAGAAATAAAAGCTTTACAAGGGCAGCTGGCAGCACTGAAAGAAATTCAGGCTGAAGTTCAG
16: GAAAAAGAAAGTGATGCTGTGAAGCCAAAGAATCAGG [AGG = revcomp of variant end]
                                     ^ refcodon start
17: ACATTCAAGGAGGGAAAGCAGCCCCTGCTGAAGCTCTGGGAATCCCGGATGAGCACTATCTAGATAA
                                        refcodon end ^ (but this exon should not be altered)

So it looks like VariantAnnotation incorrectly extends the REFCODON from one exon into the next, while the VARCODON extends into the (deleted part of the) intron?
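
The sequence claims above can be checked mechanically from the strings quoted in this report. This is a small Python sketch, not R, and it does not use VariantAnnotation. The function `split_across_junction` is a hypothetical helper that looks for a split of REFCODON into a suffix of exon 16 followed by a prefix of exon 17.

```python
EXON16 = "GAAAAAGAAAGTGATGCTGTGAAGCCAAAGAATCAGG"
EXON17 = "ACATTCAAGGAGGGAAAGCAGCCCCTGCTGAAGCTCTGGGAATCCCGGATGAGCACTATCTAGATAA"
REF = "CGATTCTTTACACACCCCAGTTCTTTGTGCACCCCAATTCTTTACATACCCT"
REFCODON = "CAGGACATTCAAGGAGGGAAAGCAGCCCCTGCTGAAGCTCTGGGAATCCCGGAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_across_junction(codons, upstream, downstream):
    """Return k if codons == (last k bases of upstream) + (a prefix of downstream).

    Returns None when no such split exists.
    """
    for k in range(1, len(codons)):
        if upstream.endswith(codons[:k]) and downstream.startswith(codons[k:]):
            return k
    return None


# The last 3 ref bases (plus strand) are the part of the variant inside exon 16.
exonic_part = revcomp(REF[-3:])
print("exonic part on transcript strand:", exonic_part)
print("matches end of exon 16:", EXON16.endswith(exonic_part))

k = split_across_junction(REFCODON, EXON16, EXON17)
print("REFCODON bases from exon 16:", k)
print("REFCODON bases from exon 17:", None if k is None else len(REFCODON) - k)
```

If the strings are as pasted, the reported REFCODON is the end of exon 16 followed directly by the start of exon 17, which is consistent with the reference allele being read across the splice junction rather than stopping at the exon boundary.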