Closed lauraluebbert closed 1 year ago
Thanks for this and the PR!
I like how the sets for B, J, Z are built as unions of other (unambiguous) codons. Note that the reverse translation in these cases will not be degenerate: it will pick one possible codon among all unambiguous possibilities (either deterministically or randomly, depending on the `reverse_translate` parameters).
For `X`, however, this will introduce degenerate nucleotides `N` into the framework, which is a bit trickier. And beware that `X` excludes the stop codons TGA, TAG and TAA, and so is not strictly reverse-translatable to `NNN`. A simpler way to support `X` would be "the union of all other amino-acid codon sets", and then the reverse translation would pick one (deterministically or randomly).
Would this suit your use case? Is it for use in DnaChisel or are you using `reverse_translate()` independently? Asking because I think there are ways to make DnaChisel "generate anything that's a X (or a B or Z) in the final sequence".
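For reference, the union approach described above could look roughly like the sketch below. This is not DnaChisel's actual implementation: the table layout, the name `reverse_translate_sketch`, and the `randomize` and `rng` parameters are all assumptions made for illustration. Only the standard genetic code is covered.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and "*") to the list of codons that encode it.
CODONS = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append("".join(bases))

# Ambiguous amino acids as unions of unambiguous codon sets.
CODONS["B"] = CODONS["D"] + CODONS["N"]  # Asp or Asn
CODONS["Z"] = CODONS["E"] + CODONS["Q"]  # Glu or Gln
CODONS["J"] = CODONS["I"] + CODONS["L"]  # Ile or Leu
# X: union of the codon sets of the 20 standard amino acids (no stop codons).
CODONS["X"] = [c for aa in "ACDEFGHIKLMNPQRSTVWY" for c in CODONS[aa]]


def reverse_translate_sketch(protein, randomize=False, rng=None):
    """Pick one concrete codon per residue (hypothetical helper, not the library API)."""
    rng = rng or random.Random()
    pick = rng.choice if randomize else (lambda codons: codons[0])
    return "".join(pick(CODONS[aa]) for aa in protein)
```

With this layout the output never contains degenerate nucleotides: every `X`, `B`, `Z` or `J` becomes one ordinary codon drawn from its union set.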
Thanks for the quick response!
You're right; that is a better solution for `X`. I adjusted my PR accordingly.
I am currently only using the `reverse_translate()` function independently.
Merging was delayed due to another PR. Added a small test to ensure all future versions can handle ambiguous amino acids. Thanks for the contribution.
It would be great if the `reverse_translate` function could handle unknown amino acids in the sequence denoted by 'X', which would be reverse translated to 'NNN'.
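A quick way to see why mapping 'X' to 'NNN' is not an exact round trip: expanding 'NNN' gives all 64 codons, three of which are stop codons in the standard genetic code. A minimal sketch, where `expand_degenerate_codon` is a hypothetical helper and only the `N` ambiguity code is handled:

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

# Minimal IUPAC subset: plain bases plus N (any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def expand_degenerate_codon(codon):
    """Return the set of concrete codons matched by a degenerate codon."""
    return {"".join(bases) for bases in product(*(IUPAC[b] for b in codon))}


nnn = expand_degenerate_codon("NNN")
stops_in_nnn = nnn & STOP_CODONS   # stop codons that NNN can also stand for
sense_codons = nnn - STOP_CODONS   # codons that encode an amino acid
```

Here `nnn` holds 64 codons, of which 61 are sense codons, so 'NNN' can translate back to a stop rather than to an amino acid.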