Edinburgh-Genome-Foundry / DnaChisel

A versatile DNA sequence optimizer
https://edinburgh-genome-foundry.github.io/DnaChisel/
MIT License
213 stars, 38 forks

Custom Codon Usage tables for Codon Optimization #46

Closed eggrandio closed 3 years ago

eggrandio commented 3 years ago

Hello,

I just found the DNA Chisel Python library and it looks like it's exactly what I was looking for!

I want to codon-optimize the sequences of some proteins for expression in plants, but I would like to use "custom" codon usage tables. In the help files I found that "You can also use a TaxID to refer to a species, e.g. species=1423 at which case the codon frequencies will be downloaded from the Kazusa codon usage database (assuming it isn’t down!)". The data on that site is badly outdated: the codon usage tables were last generated from 2007 data, and for some species many more sequences are now available, which would allow more reliable tables to be built. Moreover, I have built codon usage tables for groups of related plant species and would like to test whether they work.

My question is: is there any way of "loading" a custom codon usage table? I could prepare them in the same format as the Kazusa webpage. However, I am not proficient in Python and would not even know where to start...

Best,

veghp commented 3 years ago

Hi, thanks for the interest in this. I think the documentation has what you are looking for: Codon Optimization Specifications. Specifically, parameter codon_usage_table: Optional codon usage table of the species for which the sequence will be codon-optimized, which can be provided instead of species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon).

So you will need to build a dictionary in that form, then pass it to the function:

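import dnachisel
# Truncated example: fill in the remaining amino acids in the same way.
my_codon_table = {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...}
dnachisel.builtin_specifications.CodonOptimize(codon_usage_table=my_codon_table)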
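For a table built by hand, a quick sanity check before passing it to CodonOptimize can catch missing or misplaced codons. The following is a minimal sketch and not part of DnaChisel: `GENETIC_CODE`, `check_codon_table` and `optimize_with_table` are hypothetical helper names, the standard genetic code is assumed, and the check assumes the values are per-amino-acid fractions that sum to about 1 (as the stop codon values in the example above suggest). The optimization step uses DnaChisel's `DnaOptimizationProblem`, `EnforceTranslation` and `CodonOptimize` in the way its documentation shows for built-in species.

```python
from itertools import product

# Standard genetic code, built from the usual compact TCAG-ordered string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Maps each codon to its one-letter amino acid ('*' means stop).
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def check_codon_table(table, tolerance=0.02):
    """Return a list of problems found in a {'aa': {'codon': fraction}} table.

    Hypothetical helper: an empty list means the table covers all 64 codons
    and each amino acid's fractions sum to roughly 1 (an assumption here).
    """
    problems = []
    for codon, aa in GENETIC_CODE.items():
        if codon not in table.get(aa, {}):
            problems.append(f"missing codon {codon} for amino acid {aa}")
    for aa, codons in table.items():
        for codon in codons:
            if GENETIC_CODE.get(codon) != aa:
                problems.append(f"codon {codon} is not a codon for {aa}")
        total = sum(codons.values())
        if abs(total - 1.0) > tolerance:
            problems.append(
                f"fractions for {aa} sum to {total:.3f}, expected about 1"
            )
    return problems


def optimize_with_table(sequence, table):
    """Codon-optimize a coding sequence with a custom table (needs DnaChisel)."""
    # Imported lazily so the check above also works without DnaChisel installed.
    from dnachisel import CodonOptimize, DnaOptimizationProblem, EnforceTranslation

    problem = DnaOptimizationProblem(
        sequence=sequence,  # length must be a multiple of 3
        constraints=[EnforceTranslation()],  # keep the protein unchanged
        objectives=[CodonOptimize(codon_usage_table=table)],
    )
    problem.optimize()
    return problem.sequence
```

Running `check_codon_table(my_codon_table)` first and fixing whatever it reports, then calling `optimize_with_table("ATG...", my_codon_table)` with your own coding sequence, keeps table mistakes separate from optimization errors.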

In what format do you have the codon usage table?

eggrandio commented 3 years ago

Hi Peter,

Thanks for the reply! That is exactly what I was looking for. I have the tables in GCG Wisconsin Package format, but it should be very easy to convert them to a dict.

Best,
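For readers with the same kind of file, the conversion could look roughly like the sketch below. It assumes the usual GCG layout of whitespace-separated columns (AmAcid, Codon, Number, /1000, Fraction) with three-letter amino acid codes and "End" for stop codons; check this against your own files. `gcg_to_codon_table` and `THREE_TO_ONE` are hypothetical names, and the sample counts are made up for illustration. The fractions are recomputed from the Number column so that each amino acid's codons sum to 1.

```python
# Three-letter codes as they appear in GCG tables, mapped to one-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "End": "*",
}


def gcg_to_codon_table(text):
    """Convert GCG-style codon usage text to {'aa': {'codon': fraction}}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything that is not a data row.
        if len(fields) < 3 or fields[0].capitalize() not in THREE_TO_ONE:
            continue
        try:
            number = float(fields[2])
        except ValueError:
            continue
        aa = THREE_TO_ONE[fields[0].capitalize()]
        codon = fields[1].upper().replace("U", "T")  # accept RNA codons too
        counts.setdefault(aa, {})[codon] = number

    table = {}
    for aa, codon_counts in counts.items():
        total = sum(codon_counts.values())
        table[aa] = {
            # Fall back to equal shares if an amino acid has no observations.
            codon: (n / total if total else 1.0 / len(codon_counts))
            for codon, n in codon_counts.items()
        }
    return table


# Made-up sample in the assumed layout, only two amino-acid groups shown.
SAMPLE = """AmAcid  Codon     Number    /1000     Fraction   ..

Gly     GGG      10.00      5.00      0.10
Gly     GGA      20.00     10.00      0.20
Gly     GGT      30.00     15.00      0.30
Gly     GGC      40.00     20.00      0.40

End     TGA       1.00      0.50      0.25
End     TAA       3.00      1.50      0.75
"""

sample_table = gcg_to_codon_table(SAMPLE)
```

With a complete file, the result of `gcg_to_codon_table(open("my_table.cod").read())` (the file name is a placeholder) has the dict form described earlier in the thread and can be passed as `codon_usage_table`.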

On Thu, Oct 8, 2020 at 10:00 AM Peter Vegh notifications@github.com wrote:

Hi, thanks for the interest in this. I think the documentation has what you are looking for: Codon Optimization Specifications (https://edinburgh-genome-foundry.github.io/DnaChisel/ref/builtin_specifications.html?highlight=codon#codon-optimization-specifications). Specifically, parameter codon_usage_table: Optional codon usage table of the species for which the sequence will be codon-optimized, which can be provided instead of species. A dict of the form {'*': {"TGA": 0.112, "TAA": 0.68}, 'K': ...} giving the RSCU table (relative usage of each codon).