Closed Lix1993 closed 4 years ago
Thanks for the PR. I would accept it, but could you explain the need or give a reference for it? Is it common to strictly avoid rare codons?
Removing rare codons increases protein expression, based on our experimental results.
I am not against the feature, but I have two objections to the implementation:
Therefore I would suggest, instead of adding a parameter to CodonOptimize, creating a new specification AvoidRareCodons(species="e_coli", min_frequency=0.20) that can be used in addition to CodonOptimize (or instead of CodonOptimize).
Would that make sense?
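To make the proposal concrete, here is a minimal plain-Python sketch of what an "avoid rare codons" check could evaluate. It is not the library's implementation: the function name find_rare_codons and the frequency table are hypothetical, and the frequencies are made-up illustrative values rather than a real E. coli table.

```python
# Illustrative relative codon frequencies per amino acid.
# These numbers are made up for the example; they are NOT a real E. coli table.
EXAMPLE_FREQUENCIES = {
    "K": {"AAA": 0.76, "AAG": 0.24},
    "R": {"CGT": 0.40, "CGC": 0.40, "CGG": 0.10,
          "CGA": 0.06, "AGA": 0.03, "AGG": 0.01},
}


def find_rare_codons(sequence, frequencies, min_frequency):
    """Return (start_index, codon) for each in-frame codon rarer than min_frequency."""
    # Flatten {amino_acid: {codon: freq}} into {codon: freq} for direct lookup.
    codon_frequency = {
        codon: freq
        for codons in frequencies.values()
        for codon, freq in codons.items()
    }
    rare = []
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for start in range(0, len(sequence) - len(sequence) % 3, 3):
        codon = sequence[start:start + 3]
        # Codons missing from the table are skipped rather than flagged.
        if codon in codon_frequency and codon_frequency[codon] < min_frequency:
            rare.append((start, codon))
    return rare


# Codons are AAA, CGT, AGA, AAG; only AGA falls below 0.20 in the example table.
print(find_rare_codons("AAACGTAGAAAG", EXAMPLE_FREQUENCIES, min_frequency=0.20))
```

A check like this is independent of how the remaining codons are chosen, which is why it can sit alongside CodonOptimize or replace it.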
I think it may help.
In addition, when dealing with multiple optimization objectives, boost cannot really act as a problem weight, since the scores are not in the same range.
For example, codon_usage may get a score of -200 while avoid_hairpin gets a score below 10.
Do you have any suggestions for making the optimization objectives more 'equal'?
Ok, I am working on the library right now so I'll add an AvoidRareCodons specification, which you'll be able to use on top of other optimization methods.
Regarding the objectives' scores and weights, it is true that different optimization objectives have typical scores in different ranges, as they are not always easy to compare with one another, and there is no other way right now than to play around with the boost parameter. I recognize this is an issue and I am open to suggestions. Right now, specification scores are designed so that, ideally, a nucleotide mutation should contribute between 0 and +1 to the overall score. However, that doesn't make every specification "comparable". Let me know if you have a particular example in mind where this could be a problem.
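The score-range issue can be illustrated with a small plain-Python sketch. It assumes the overall score is a boost-weighted sum of objective scores; the helper contribution_shares and the example numbers are hypothetical and only mirror the magnitudes mentioned in this thread.

```python
def contribution_shares(scores, boosts):
    """Fraction of the total weighted score magnitude owed to each objective."""
    weighted = {name: abs(boosts[name] * score) for name, score in scores.items()}
    total = sum(weighted.values())
    if total == 0:
        # Nothing contributes, so every share is zero.
        return {name: 0.0 for name in weighted}
    return {name: value / total for name, value in weighted.items()}


# Example magnitudes taken from the discussion, not from a real run.
scores = {"codon_usage": -200.0, "avoid_hairpin": -10.0}

# With equal boosts, codon_usage accounts for about 95% of the total.
equal = contribution_shares(scores, {"codon_usage": 1.0, "avoid_hairpin": 1.0})

# A boost of 20 on avoid_hairpin is needed before both objectives weigh the same.
rebalanced = contribution_shares(scores, {"codon_usage": 1.0, "avoid_hairpin": 20.0})

print(equal)
print(rebalanced)
```

This is why equal boosts do not mean equal influence: the boost that balances two objectives depends on their typical score ranges, which currently has to be found by trial.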
Closing this in favor of the new AvoidRareCodons specification class.
Thanks for your help.
I am not dealing with multiple objectives yet, but that is my goal.
Our goal is to optimize a CDS sequence using different objectives with uniform weights.
We will then use experiments to determine which functions primarily affect protein expression.
Then we'll reoptimize the sequence with different weights.
I'm currently working on the evaluation functions of specific features, such as TAI, leading peptide...
When I get to the multi-objective problem, I'll paste an example here.
Sorry for my poor English..
No worries, it is all clear and I really appreciate your suggestions. Let me know if you run into more problems or have improvement ideas.
For codons whose usage frequency is below a threshold (such as 0.1), set their usage to 0.
Since get_codons_table() is a staticmethod, use remove_codons_below_threshold() when codons_usage_threshold > 0.
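A minimal sketch of the thresholding step described above. The function name comes from this PR, but the signature, the nested-dictionary table layout, and the example frequencies are assumptions made for illustration. Frequencies are zeroed without renormalizing, since the description only says to set them to 0.

```python
def remove_codons_below_threshold(codons_table, threshold):
    """Return a copy of the table with usages below threshold set to 0."""
    # Assumed layout: {amino_acid: {codon: relative_frequency}}.
    # The input table is left untouched; a new nested dict is built instead.
    return {
        amino_acid: {
            codon: (0.0 if freq < threshold else freq)
            for codon, freq in codons.items()
        }
        for amino_acid, codons in codons_table.items()
    }


# Made-up frequencies for illustration only, not a real species table.
EXAMPLE_TABLE = {
    "K": {"AAA": 0.76, "AAG": 0.24},
    "R": {"CGT": 0.40, "CGC": 0.40, "CGG": 0.10,
          "CGA": 0.06, "AGA": 0.03, "AGG": 0.01},
}

codons_usage_threshold = 0.1
table = EXAMPLE_TABLE
# Only filter when a positive threshold was requested.
if codons_usage_threshold > 0:
    table = remove_codons_below_threshold(EXAMPLE_TABLE, codons_usage_threshold)

print(table["R"])
```

Note that a codon exactly at the threshold (CGG at 0.10 here) is kept, because only frequencies strictly below the threshold are zeroed.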