Closed gregcaporaso closed 10 years ago
I'll change the app controller to import GeneticCodes from cogent.core.genetic_code
Because some genetic code names are very long (e.g., "Mold, Protozoan, and Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma Nuclear"), I think it will be best to accept the genetic code selection as the ID, and refer users to this page (http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1) when necessary. Does this sound reasonable?
Yes, that sounds good. Can you include that link in the help text associated with that option?
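For reference, a minimal sketch of what an ID-based option with the link in its help text could look like. It uses the standard-library `argparse` rather than QIIME's own option handling, and the option name `--genetic_code`, the helper `build_parser`, and the default of 11 are assumptions for illustration, not the merged code.

```python
import argparse

# Link taken from the discussion above; users look up IDs here.
NCBI_GENETIC_CODES_URL = (
    "http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1"
)


def build_parser():
    """Build a parser with a hypothetical --genetic_code option."""
    parser = argparse.ArgumentParser(
        description="Sketch of a genetic code option selected by NCBI ID."
    )
    parser.add_argument(
        "--genetic_code",
        type=int,      # accept the numeric ID, not the long name
        default=11,    # assumed default: Bacterial Nuclear and Plant Plastid
        help="ID of the genetic code to use for translation; see "
        + NCBI_GENETIC_CODES_URL
        + " for the list of IDs [default: %(default)s]",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.genetic_code)
```

Taking an integer keeps the command line short even for codes with very long names, and `type=int` rejects non-numeric input before any mapping starts.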
@adamrp, what's the current status on this?
Ehh I'm not sure what's going on. @ElDeveloper tried to help me figure out what happened, but we are not sure. The code was merged (see https://github.com/qiime/qiime/blame/6d9f59bdda7b412d0efed6e2534439effbe78095/scripts/map_reads_to_reference.py), but then it was completely blown away somehow, not even showing up in history, blame, or log! Anyways, I still have the branch lying around luckily, so I'll submit a new pull request shortly. How odd!
Any translated mapping done via `map_reads_to_reference.py` should allow users to specify the genetic code that should be used for the six-frame translation step. Currently we're defaulting to the vertebrate nuclear genetic code, but we should likely be defaulting to Bacterial Nuclear and Plant Plastid (code 11 in `cogent.core.genetic_code`). I did some comparisons of randomly selected metagenome reads with tblastx via NCBI, and didn't notice big differences, so I don't think this would explain any mapping results where fewer than expected sequences are hitting the reference, but it should still be modified for flexibility and so e-values and percent IDs are accurate.
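To make the six-frame translation step concrete, here is a rough, self-contained sketch of how the genetic code ID changes the result. It is not the `cogent.core.genetic_code` implementation: the function names are hypothetical, and the amino-acid strings are copied from the NCBI genetic code page for IDs 1, 4, and 11 only. On that page, code 11 lists the same codon-to-amino-acid assignments as the standard code (ID 1) and differs in its permitted start codons, which may be consistent with the small differences reported above.

```python
from itertools import product

# NCBI lists codons with bases in T, C, A, G order: TTT, TTC, TTA, TTG, TCT, ...
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]

# Amino-acid strings keyed by NCBI genetic code ID (only three shown here).
AMINO_ACIDS = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    4: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    11: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, table):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq, code_id=11):
    """Return three forward then three reverse-strand translations."""
    seq = seq.upper()
    table = dict(zip(CODONS, AMINO_ACIDS[code_id]))
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            frames.append(translate(strand[offset:], table))
    return frames


if __name__ == "__main__":
    for code_id in (11, 4):
        print(code_id, six_frames("ATGTGATTT", code_id))
```

For the read `ATGTGATTT`, the first forward frame is `M*F` under code 11 but `MWF` under code 4, because code 4 reads TGA as tryptophan instead of a stop.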