translated database mappers should allow user to specify the genetic code

biocore / qiime

Official QIIME 1 software repository. QIIME 2 (https://qiime2.org) has succeeded QIIME 1 as of January 2018.

GNU General Public License v2.0


translated database mappers should allow user to specify the genetic code #599

Closed: gregcaporaso closed this issue 10 years ago

gregcaporaso commented 11 years ago

Any translated mapping done via map_reads_to_reference.py should allow users to specify the genetic code that should be used for the six-frame translation step. Currently we're defaulting to the vertebrate nuclear genetic code, but we should likely be defaulting to Bacterial Nuclear and Plant Plastid (code 11 in cogent.core.genetic_code).

I compared some randomly selected metagenome reads against tblastx results from NCBI and didn't notice big differences, so I don't think this explains any mapping results where fewer sequences than expected are hitting the reference. It should still be changed, both for flexibility and so that e-values and percent IDs are accurate.
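For illustration, here is a minimal sketch of what a six-frame translation with a selectable genetic code could look like. It is plain Python and does not use the cogent or QIIME APIs; the function names (`translate`, `six_frame_translate`) and the `GENETIC_CODES` dictionary are hypothetical, and only three NCBI tables are included. Note that tables 1 and 11 share the same codon-to-amino-acid assignments and differ only in their permitted start codons, which this sketch does not model.

```python
from itertools import product

# Codons in NCBI order: first base varies slowest, bases ordered T, C, A, G.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Amino acid strings from the NCBI genetic code tables, keyed by table ID.
# Only a few tables are listed; this dict is an illustrative assumption,
# not the data structure used by cogent.core.genetic_code.
GENETIC_CODES = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    4: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    11: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq, code_id):
    """Translate seq in frame 0 using the given genetic code ID.

    Codons containing characters other than T, C, A, G become 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    table = dict(zip(CODONS, GENETIC_CODES[code_id]))
    return "".join(
        table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq, code_id=11):
    """Return a dict mapping frame (+1..+3, -1..-3) to its translation."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:], code_id)
        frames[-(offset + 1)] = translate(rc[offset:], code_id)
    return frames
```

With this sketch, `six_frame_translate("ATGTGATAA", code_id=11)[1]` gives `"M**"`, while `code_id=4` gives `"MW*"`, because table 4 reads TGA as tryptophan rather than a stop.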

adamrp commented 11 years ago

I'll change the app controller to import GeneticCodes from cogent.core.genetic_code.

Because some genetic code names are very long (e.g., "Mold, Protozoan, and Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma Nuclear"), I think it will be best to accept the genetic code selection as the ID, and refer users to this page (http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1) when necessary. Does this sound reasonable?

gregcaporaso commented 11 years ago

Yes, that sounds good. Can you include that link in the help text associated with that option?
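As a rough sketch of the option discussed above, a script could accept the genetic code as an integer ID and point users to the NCBI table in the help text. This uses the standard-library `argparse` module rather than QIIME's own option machinery, and the flag name `--genetic_code`, the default of 11, and the `build_parser` helper are assumptions made for illustration only.

```python
import argparse

# Link given in the thread for looking up genetic code IDs.
GENETIC_CODE_URL = "http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1"


def build_parser():
    """Build a parser with a hypothetical --genetic_code option."""
    parser = argparse.ArgumentParser(
        description="Sketch of a translated-mapping command line."
    )
    parser.add_argument(
        "--genetic_code",
        type=int,
        default=11,  # assumed default, following the suggestion in this thread
        help=(
            "ID of the genetic code to use for translated searches "
            "[default: %(default)s]. See " + GENETIC_CODE_URL +
            " for the list of IDs."
        ),
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Using genetic code", args.genetic_code)
```

Taking the ID rather than the name keeps the command line short and avoids quoting problems with the longer genetic code names.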

gregcaporaso commented 11 years ago

@adamrp, what's the current status on this?

adamrp commented 10 years ago

Ehh, I'm not sure what's going on. @ElDeveloper tried to help me figure out what happened, but neither of us could work it out. The code was merged (see https://github.com/qiime/qiime/blame/6d9f59bdda7b412d0efed6e2534439effbe78095/scripts/map_reads_to_reference.py), but then it was completely blown away somehow, and it doesn't even show up in history, blame, or log! Anyway, luckily I still have the branch lying around, so I'll submit a new pull request shortly. How odd!