Closed laserson closed 7 years ago
I think the official denotation of an unknown amino acid is `X`. My feeling is that `-` is probably appropriate as well, if derived from a gap rather than an ambiguous nucleotide.
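To make the distinction concrete, here is a minimal sketch of a translation that follows this convention. It assumes the standard genetic code, and it assumes that only a codon made entirely of gap characters becomes `-`, while any other untranslatable codon (ambiguous nucleotide, partial gap) becomes `X`. The function names are hypothetical and not taken from any existing tool.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_codon(codon):
    """Translate one codon (hypothetical helper, assumed convention)."""
    codon = codon.upper()
    if codon == "---":
        # Assumption: a full-gap codon is reported as a gap, not as unknown.
        return "-"
    # Anything not in the table (e.g. contains N or a partial gap) is unknown.
    return CODON_TABLE.get(codon, "X")


def translate(seq):
    """Translate complete codons only; trailing 1-2 nucleotides are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))
```

With this sketch, `translate("TGTNNNAGC")` gives `CXS` and `translate("TGT---AGC")` gives `C-S`, so a reader of the amino acid string can tell an ambiguous call from an alignment gap.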
By out-of-frame, do you mean that the rearrangement is non-productive? Or are you referring to the uncommon case where indels put the rearrangement out of frame but it still happens to be productive? In the case of a non-productive rearrangement, I don't think it is necessary to translate the CDR3.
My feeling is that we don't mandate this for now, and it can be a use-at-your-own-risk situation? Though I'll let @mikessh chime in here.
@schristley Doesn't the tool that determines whether it's productive or unproductive have to translate it anyway, and decide that it is unproductive (due to a stop codon) in a particular frame?
@nishmm There are other ways for a rearrangement to be marked unproductive, but yes, one of the steps would be to check the CDR3 for a stop codon after translation. The discussion we had in iReceptor the other day sounds like it is specific to MIXCR (other tools?): it might conclude that a CDR3 whose length is not divisible by 3 is actually valid, with the missing/additional nucleotides assumed to be sequencing errors. This was how I understood Bojan's description. Other tools may be different.
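As a rough illustration of the two checks mentioned here, the sketch below reports the frame and stop-codon status of a CDR3 nucleotide sequence separately, so that a caller can decide how to treat a length that is not divisible by 3. The function name and the returned keys are hypothetical, and this does not reproduce the logic of any particular tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def check_cdr3(nt_seq):
    """Report frame and stop-codon status for a CDR3 (hypothetical helper)."""
    seq = nt_seq.upper()
    # In frame only if the length is a multiple of 3.
    in_frame = len(seq) % 3 == 0
    # Read complete codons from the 5' end; leftover nucleotides are skipped.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    has_stop = any(codon in STOP_CODONS for codon in codons)
    return {"in_frame": in_frame, "has_stop": has_stop}
```

Keeping the two flags separate leaves room for the tool-specific behavior described above: one tool could reject anything with `in_frame` false, while another could tolerate it as a suspected sequencing error.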
This seems to be tool-dependent to some extent, and can probably be addressed using custom fields. I'm not sure there is anything actionable here, so I am closing.
Shugay: