Closed zktuong closed 11 months ago
Yes, I think you are indeed missing something:
Please note that almost all of the mutations in question are on transcripts that are encoded on the - strand of the corresponding DNA, so you need to check the reverse complement.
The remaining mutations in question (which are on the + strand) result from indels/frameshifts, for which your code defaults to (0, 0).
For your reference, I attached a table with the strand information of the transcripts for which you suspect incorrect variants: mart_export_GH_68.txt
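To illustrate the strand point, here is a minimal Python sketch. It is not part of nextNEOpi; the helper names `reverse_complement` and `to_coding_strand` are hypothetical, and the example codons are only chosen to be consistent with the amino acid changes reported in the question, assuming the affected transcripts are on the - strand.

```python
from itertools import product

# Standard genetic code, built from the compact TCAG-ordered representation.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def to_coding_strand(seq: str, strand: str) -> str:
    """Express a sequence given on the genomic + strand on the transcript's coding strand."""
    return reverse_complement(seq) if strand == "-" else seq


# A G>A change as reported on the + strand, for a transcript on the - strand:
# the + strand triplet ACG becomes ACA, which the transcript reads as CGT -> TGT.
ref_codon = to_coding_strand("ACG", "-")
alt_codon = to_coding_strand("ACA", "-")
print(ref_codon, CODON_TABLE[ref_codon], "->", alt_codon, CODON_TABLE[alt_codon])

# A C>G change on the + strand triplet TCT (one of the candidates listed in
# the question) gives TGT, which the transcript reads as AGA -> ACA.
ref_codon2 = to_coding_strand("TCT", "-")
alt_codon2 = to_coding_strand("TGT", "-")
print(ref_codon2, CODON_TABLE[ref_codon2], "->", alt_codon2, CODON_TABLE[alt_codon2])
```

Read this way, a genomic G->A can give R->C and a genomic C->G can give R->T, because the transcript sees the complementary base change.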
HTH
oh ok thanks! that makes a lot of sense!
Hi,
We (@ali-harasty and myself) are getting confused by some of the results in the final neoantigen results table, e.g. the `MHC_Class_I_all_epitopes_ccf_ref` file. Some of the mutations make sense but some don't.
For example, in the following screenshot, the first row is a C->G nucleotide mutation, resulting in a Q->E mutation at the amino acid level:
`Q` can be `CAA` or `CAG`, and `E` can be `GAA` or `GAG`, so the change from the first `C` to the first `G` makes perfect sense. The reference sequence matching the mutated codon sequence can also be found (`GRCh38.d1.vd1.fa` from the nextNEOpi resource bundle).

However, if you look at the second one, it's a G->A for an R->C mutation. There's no way this can happen, as there's no `A` in the codons encoding cysteine:
`R`: CGT, CGC, CGA, CGG, AGA, AGG
`C`: TGT, TGC
You see this occurring a couple more times where it's just not possible.
Sometimes the reference is also not possible, e.g. the 4th row's mutation (C->G/R->T). The variant here is C, and on the reference the possible flanking codon sequences are `'CTT', 'TCT', 'GTC'` (the `Stop` column position is the correct position), but none of these match `R`. The variant here is also not possible:
`R`: CGT, CGC, CGA, CGG, AGA, AGG
`T`: ACT, ACC, ACA, ACG
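The codon check done by hand above can be sketched in Python as follows. This is a minimal illustration, not code from nextNEOpi; the helper name `single_base_changes` is hypothetical, and it only considers codons as read on the coding strand of the transcript.

```python
from itertools import product

# Standard genetic code, built from the compact TCAG-ordered representation.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_changes(ref_aa: str, alt_aa: str) -> list[tuple[str, str, str]]:
    """List (ref_codon, alt_codon, base_change) for every single-nucleotide
    substitution that turns a codon for ref_aa into a codon for alt_aa."""
    changes = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == alt_aa:
                    changes.append((codon, mutant, f"{codon[pos]}>{base}"))
    return changes


for ref_aa, alt_aa in [("Q", "E"), ("R", "C"), ("R", "T")]:
    print(ref_aa, "->", alt_aa, single_base_changes(ref_aa, alt_aa))
```

On the coding strand, Q->E needs a C>G change, R->C needs C>T, and R->T needs G>C, which is why the reported G->A and C->G changes look impossible when read directly.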
So I went through the whole table and it's just not possible for a lot of the results:
I've attached the results here so you can take a look: Archive.zip
Are we missing something?