lauringlab / CodonShuffle

MIT License

problem with sort and codon usage #5

Open pkvaz opened 6 years ago

pkvaz commented 6 years ago

Hi, I got an error message with the term 'sort', which is used three times in the script. It appears the term is deprecated and I now need to use either sort_values or sort_index. Which should be used in each instance? I tried a few different combinations until I got an output, but I'm not sure that's the best approach.

Second question, possibly connected: in my output I've had difficulty obtaining shuffled sequences that retain their codon usage bias, even when they have identical CAI and ENC scores. I've tried all four permutation scripts, so I wondered if this was normal or possibly something I'm doing wrong. Any suggestions? Cheers

alauring commented 6 years ago

Thanks for your message. It is my understanding/memory that the "sort" command is mainly used to order the output files. As you suggest, they can either be by the index or the actual value of each individual sequence. It does not make a difference in terms of the output.
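For readers hitting the same error, here is a minimal sketch of the two replacements. It assumes the script's `sort` calls are pandas `DataFrame.sort` calls (deprecated in pandas 0.17 and later removed). The table and its column names are hypothetical and are not taken from the CodonShuffle source.

```python
import pandas as pd

# Hypothetical per-sequence metrics; the column names and labels are illustrative only.
scores = pd.DataFrame(
    {"cai": [0.71, 0.69, 0.74], "enc": [52.1, 53.0, 51.4]},
    index=["seq_3", "seq_1", "seq_2"],
)

# A legacy call that named a column, e.g. scores.sort("cai"),
# maps to sort_values: rows are ordered by that column's values.
by_value = scores.sort_values(by="cai")

# A legacy call that ordered rows by their labels maps to sort_index.
by_label = scores.sort_index()

print(by_value.index.tolist())
print(by_label.index.tolist())
```

As a rule of thumb, a call that passes a column name most likely wants `sort_values(by=...)`, and one that orders by row label wants `sort_index()`. Which applies to each of the three calls in the script has to be checked against the script itself.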

As for your second question, I am a little confused. The permutation scripts in CodonShuffle just shuffle, or rearrange, the codons in a sequence. The codon usage bias, as assessed by CAI and ENC, is largely preserved. This can be seen in Figures 3 and 4 of the manuscript. While it is inevitable that there will be small differences in the actual metrics (the exact numbers of each codon are not the same), they are quite close to the original sequence.

I also forwarded your message to danielmjorge (danielmacedo.jorge@gmail.com), who wrote the code, in case he has anything to add.

danielmjorge commented 6 years ago

Thanks for your message. About the first question: which version of Python are you using? We used version 2.7 for the code, and version 3.x seems to be a little different, which could explain the error with the sort command. When you run the script, do you get the final graphs, or do you get an error from the script? Thank you.

pkvaz commented 6 years ago

Hi! Thank you both so much for your quick responses. I've been swamped and hadn't had a chance to get back to this until now. In response to Daniel's questions, I've been using Python version 2.7, as I'd heard about issues with 3. I was getting an error with the graphs, but I replaced

ggsave(cai_graph, cai_graphname)

with

    cai_graph.save(cai_graphname)

and that seemed to help get graphics output. You're right, Adam, that the ENC and CAI are largely preserved and do match the output from your paper, but I was wondering whether it is possible to obtain an output where the actual codon usage is identical to the input using this script. We are interested in trying to deoptimise the codon pair bias while retaining identical codon usage.

alauring commented 6 years ago

If I am understanding your question, this is exactly what CodonShuffle will do. The permutation scripts just shuffle the existing codons around, so the overall codon usage of the gene in question will not change. There could be slight (1-2 codon) changes in the ENC or CAI metrics, I think because of issues at the ends (and the requirement for AUG). However, the usage should be pretty much the same, while the codon pair bias will vary.
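The property described here can be checked directly by counting codons before and after shuffling. The sketch below is a toy illustration, not CodonShuffle's own code. It assumes an uppercase DNA coding sequence with no ambiguous bases, uses the standard genetic code, and permutes synonymous codons among the positions of each amino acid. The function names are hypothetical.

```python
import itertools
import random
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}

def split_codons(seq):
    """Split a coding sequence into consecutive 3-nt codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def translate(seq):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(seq))

def synonymous_shuffle(seq, seed=None):
    """Permute codons only among positions that encode the same amino acid."""
    parts = split_codons(seq)
    rng = random.Random(seed)
    positions = defaultdict(list)
    for i, codon in enumerate(parts):
        positions[CODON_TABLE[codon]].append(i)
    shuffled = list(parts)
    for idxs in positions.values():
        pool = [parts[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)

original = "ATGGCTGCCGCAGCGTTATTGCTTCTCTAA"  # made-up example sequence
permuted = synonymous_shuffle(original, seed=1)

# Codon counts and the encoded protein are unchanged; only codon order differs.
same_usage = Counter(split_codons(original)) == Counter(split_codons(permuted))
same_protein = translate(original) == translate(permuted)
print(same_usage, same_protein)
```

Running the same `Counter` comparison on an input sequence and a CodonShuffle output would show whether the codon usage is exactly identical or differs by a codon or two. Because only the order of codons changes, the adjacent codon pairs, and therefore the codon pair bias, are free to vary.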

