microbiomedata / mixs-6-2-release-candidate

Proposed, Harmonized MIxS 6.2
https://github.com/GenomicsStandardsConsortium/mixs6.2_release_candidate
MIT License

Standardize taxon identifiers #184

Open turbomam opened 1 year ago

turbomam commented 1 year ago

https://www.ncbi.nlm.nih.gov/books/NBK21100/ says:

Taxids are indexed with the prefix txid: txid9606 [orgn].

Source organism modifiers are indexed in the [properties] field, and such queries would be in the form: src strain[prop], src variety[prop], or src specimen voucher[prop]. These queries will retrieve all entries with a strain qualifier, a variety qualifier, or a specimen_voucher qualifier, respectively.

All of the organism source feature modifiers (/clone, /serovar, /variety, etc.) are indexed in the text word field, [text word]. For example, one could query GenBank for: “strain k-12” [text word]. Because strain information is inconsistent in the sequence databases (as in the literature), a better query would be: “strain k 12”[word] OR “strain k12”[word]. Note: explicit double-quotes may be necessary for some of these queries.
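The query forms quoted above can be assembled programmatically. Here is a minimal sketch that only builds the strings and sends nothing to NCBI. The function names are hypothetical, and the taxid form is written without a space before the field tag, as in the search URL quoted later in this thread.

```python
def taxid_term(taxid: int) -> str:
    """Organism query for a taxid, e.g. txid9606[orgn]."""
    return f"txid{taxid}[orgn]"


def source_modifier_term(modifier: str) -> str:
    """Query for entries carrying a source modifier, e.g. src strain[prop]."""
    return f"src {modifier}[prop]"


def strain_word_term(*spellings: str) -> str:
    """OR together quoted spelling variants of a strain name in [word]."""
    return " OR ".join(f'"strain {s}"[word]' for s in spellings)


print(taxid_term(9606))
print(source_modifier_term("strain"))
print(strain_word_term("k 12", "k12"))
```

Passing several spellings to `strain_word_term` mirrors the advice in the quoted text to search for both "k 12" and "k12", because strain names are recorded inconsistently.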

turbomam commented 1 year ago

MIxS provides the example "Gut Metagenome [NCBI:txid749906]" for samp_taxon_id.

But does anybody else in the world use this notation? Is there a resolver somewhere?

https://www.ncbi.nlm.nih.gov/search/all/?term=txid749906[orgn]
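For reference, a search link of this shape can be generated from a bare taxid. This is a sketch that assumes the `/search/all/` endpoint and `term` parameter exactly as they appear in the URL above. `ncbi_search_url` is a hypothetical helper, and the result is a search link, not a CURIE resolver.

```python
from urllib.parse import urlencode

# Base URL copied from the link in this thread.
NCBI_SEARCH = "https://www.ncbi.nlm.nih.gov/search/all/"


def ncbi_search_url(taxid: int) -> str:
    """Build an NCBI search URL for a taxid; brackets are percent-encoded."""
    query = urlencode({"term": f"txid{taxid}[orgn]"})
    return f"{NCBI_SEARCH}?{query}"


print(ncbi_search_url(749906))
# https://www.ncbi.nlm.nih.gov/search/all/?term=txid749906%5Borgn%5D
```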

EBI OLS recommends the prefix form NCBITaxon:749906.

So we could try "Gut Metagenome [NCBITaxon:749906]" as the example instead.
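If the NCBITaxon form were adopted, existing values could be rewritten mechanically. The following is a rough sketch that assumes values always look like `label [prefix:id]`. The function name and the regular expression are illustrative and not part of MIxS.

```python
import re

# Matches "label [NCBI:txid123]" or "label [NCBITaxon:123]" (assumed shapes).
_TAXON_RE = re.compile(
    r"^(?P<label>.+?)\s*\[(?:NCBI:txid|NCBITaxon:)(?P<taxid>\d+)\]\s*$"
)


def to_ncbitaxon(value: str) -> str:
    """Rewrite a samp_taxon_id-style value to the NCBITaxon CURIE form."""
    m = _TAXON_RE.match(value)
    if m is None:
        raise ValueError(f"unrecognized taxon value: {value!r}")
    return f"{m['label']} [NCBITaxon:{m['taxid']}]"


print(to_ncbitaxon("Gut Metagenome [NCBI:txid749906]"))
# Gut Metagenome [NCBITaxon:749906]
```

Values already in the NCBITaxon form pass through unchanged, so the rewrite is safe to run more than once.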