Adibvafa / CodonTransformer

CodonTransformer: The ultimate tool for codon optimization, optimizing DNA sequences for heterologous protein expression across 164 species.
https://adibvafa.github.io/CodonTransformer
Apache License 2.0
102 stars, 3 forks

Add option to avoid user-defined restriction sites #6

Open VittorioRainaldi opened 1 month ago

VittorioRainaldi commented 1 month ago

Hello,

For my cloning I typically use Golden Gate assembly with BsaI, so it would be convenient to have an option that filters out user-defined restriction sites from the optimized sequence.

Best, Vittorio

Adibvafa commented 1 month ago

Hello! Thank you for opening an issue. That's a great suggestion. We will add non-deterministic generation with an option to find the best sequence that contains no user-defined restriction sites. Do you have a specific list of restriction sites in mind that we can test on?
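As a starting point for testing, here is a minimal sketch of how user-defined sites could be detected in an output sequence. It is not part of CodonTransformer; the names `reverse_complement` and `find_forbidden_sites` are made up for illustration. It assumes the sites are plain A/C/G/T strings and checks both strands, since a site such as BsaI's GGTCTC is not palindromic and also matters when it appears as GAGACC.

```python
# Hypothetical helpers for illustration; not part of the CodonTransformer API.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_forbidden_sites(dna: str, sites: list[str]) -> list[tuple[int, str]]:
    """Return sorted (position, site) pairs for every occurrence of a site.

    Each site is searched on both strands by also looking for its reverse
    complement in the given strand. Positions are 0-based.
    """
    dna = dna.upper()
    hits = []
    for site in sites:
        site = site.upper()
        # A set avoids double counting palindromic sites such as GAATTC.
        for motif in {site, reverse_complement(site)}:
            start = dna.find(motif)
            while start != -1:
                hits.append((start, site))
                start = dna.find(motif, start + 1)
    return sorted(hits)


BSAI = "GGTCTC"  # BsaI recognition sequence
hits = find_forbidden_sites("ATGGGTCTCAAAGAGACCTAA", [BSAI])
```

In this toy input, `hits` reports one forward-strand occurrence and one reverse-strand occurrence of the BsaI site.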

gui11aume commented 1 month ago

This would be a "forbidden sequence" feature that goes beyond restriction sites. If users do not want a given sequence in their construct, they should have a way to specify it. There are two options. The first is to find those sequences in the output and make the single-nucleotide change that causes the smallest increase in loss. The sequence would then be absent from the output, and we would get a near-optimal encoding (optimal in terms of the model's loss). The second is to use non-deterministic mode to produce multiple variants, rank them by loss (equivalently, by likelihood), and pick the best-ranked output that does not contain the sequence.
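The second option can be sketched as follows. This is only an illustration: it assumes the candidates have already been generated and scored elsewhere, and that they arrive as `(dna, loss)` pairs. The function names are hypothetical and do not exist in CodonTransformer.

```python
from typing import Iterable, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def contains_forbidden(dna: str, forbidden: Iterable[str]) -> bool:
    """True if any forbidden sequence occurs on either strand of dna."""
    dna = dna.upper()
    for seq in forbidden:
        seq = seq.upper()
        rc = seq.translate(_COMPLEMENT)[::-1]
        if seq in dna or rc in dna:
            return True
    return False


def pick_best_allowed(
    candidates: Iterable[tuple[str, float]],
    forbidden: list[str],
) -> Optional[tuple[str, float]]:
    """Return the lowest-loss candidate free of forbidden sequences.

    candidates: (dna, loss) pairs, assumed to come from sampling the model.
    Returns None when every candidate contains a forbidden sequence.
    """
    for dna, loss in sorted(candidates, key=lambda c: c[1]):
        if not contains_forbidden(dna, forbidden):
            return dna, loss
    return None


# Toy candidates with made-up losses, for illustration only.
candidates = [
    ("ATGGGTCTCTAA", 0.10),  # lowest loss, but contains GGTCTC
    ("ATGGGAAGCTAA", 0.25),
    ("ATGGGCTCTTAA", 0.15),
]
best = pick_best_allowed(candidates, ["GGTCTC"])
```

Returning `None` rather than raising leaves it to the caller to decide whether to sample more variants or fall back to the first option.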

We do not need a concrete set of restriction sites to get started with this.
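The first option described above (a single-nucleotide change with the smallest increase in loss) could look roughly like the sketch below. Several things here are assumptions rather than anything stated in this thread: the change is restricted to synonymous substitutions so that the protein is unchanged, the standard genetic code is used, and `loss_fn` is a hypothetical callable standing in for however the model's loss would be computed for a full DNA sequence. When several sites are present, the sketch repeats the single-change step until none remain.

```python
from itertools import product
from typing import Callable

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product("TCAG", repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_hits(dna: str, forbidden: list[str]) -> int:
    """Count occurrences of forbidden sequences on both strands."""
    motifs = set()
    for seq in forbidden:
        seq = seq.upper()
        motifs |= {seq, seq.translate(_COMPLEMENT)[::-1]}
    return sum(dna.startswith(m, i) for m in motifs for i in range(len(dna)))


def remove_forbidden(
    dna: str, forbidden: list[str], loss_fn: Callable[[str], float]
) -> str:
    """Greedily remove forbidden sequences with synonymous point changes.

    loss_fn is a hypothetical stand-in for the model's loss on a sequence.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    hits = count_hits(dna, forbidden)
    while hits > 0:
        best = None  # (loss, variant)
        for i, old in enumerate(dna):
            start = i - i % 3  # first base of the codon containing i
            for new in "ACGT":
                if new == old:
                    continue
                variant = dna[:i] + new + dna[i + 1:]
                # Keep the encoded amino acid unchanged.
                if CODON_TABLE[variant[start:start + 3]] != CODON_TABLE[dna[start:start + 3]]:
                    continue
                # Only accept changes that strictly reduce the hit count.
                if count_hits(variant, forbidden) >= hits:
                    continue
                loss = loss_fn(variant)
                if best is None or loss < best[0]:
                    best = (loss, variant)
        if best is None:
            raise ValueError("no synonymous single-nucleotide change helps")
        dna = best[1]
        hits = count_hits(dna, forbidden)
    return dna


# Toy loss (number of A bases), used only so the example runs.
fixed = remove_forbidden("ATGGGTCTCTAA", ["GGTCTC"], lambda s: s.count("A"))
```

The brute-force scan over every position is quadratic in sequence length and calls `loss_fn` once per viable variant; restricting the scan to positions that overlap a hit would be a natural refinement if the loss is expensive to evaluate.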