Open ramonawalls opened 9 years ago
Comment by GoogleCodeExporter Thursday Jun 11, 2015 at 23:48 GMT
An initial stab...
From Wikipedia:
----------
The definition given by NCBI is:[2]
"Taxonomic level of sampling selected by the user to be used in a study, such
as individuals, populations, species, genera, or bacterial strains."
Another definition:[3] [ Wooley, John C. "A Primer on Metagenomics". PLOS
Computational Biology. Retrieved 14 November 2012.]
"Operational taxonomic unit, species distinction in microbiology. Typically
using rRNA and a percent similarity threshold for classifying microbes within
the same, or different, OTUs"
-----------
Perhaps to generalise the def...
OTU := a taxonomic unit which includes biological entities that possess an
arbitrarily defined minimal similarity to one or more biological entities used
as OTU references, as determined through a planned process of OTU
identification.
...where biological entities can be individuals, populations, communities, etc.
...the reference biological entity can have_role "OTU reference"
...taxonomic unit would be the superclass
...one could create a process "OTU definition process" which has as parts
concepts such as "similarity threshold", the algorithm used, etc.
...Comment: usually relevant to the definition of taxonomic units based on
nucleotide sequence similarity of a suitable molecular marker gene such as the
16S rRNA gene.
In fact, a sub-class specific to the OTU used by the meta-omic community would
be needed to describe their studies (and thus capture what OBI is trying to do):
"sequence-similarity-based OTU":= an OTU which includes biological entities
that possess an arbitrarily defined minimal sequence similarity to the
sequence(s) of (part of) a phylogenetic marker gene of one or more biological
entities used as OTU references, as determined through a planned process of OTU
identification.
Cursory, but a place to kick off. Note that different OTU generating algorithms
will treat the whole thing (sequences, references, thresholds, clustering)
differently. Also, the algorithm may have only one "reference sequence" which
is, in fact, an information artefact derived from multiple, identical copies of
a gene in a collection of microbes.
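To make the roles of "OTU reference" and "similarity threshold" concrete, here is a minimal sketch of one possible greedy, reference-based clustering. It is an illustration only, not the method of any particular OTU-picking tool: the function names `identity` and `greedy_otus` are hypothetical, the sequences are assumed to be pre-aligned to equal length, and similarity is taken to be the plain fraction of matching positions.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two sequences.

    Assumption: both sequences are already aligned to the same length.
    Real pipelines would compute this from a proper alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: List[str], threshold: float = 0.97) -> Dict[str, List[str]]:
    """Assign each sequence to the first OTU reference it is similar enough to.

    Returns a mapping from reference sequence to the member sequences of its OTU.
    A sequence that matches no existing reference becomes a new reference.
    """
    otus: Dict[str, List[str]] = {}
    for s in seqs:
        for ref in otus:
            if identity(s, ref) >= threshold:
                otus[ref].append(s)
                break
        else:
            # No reference met the threshold: s founds a new OTU.
            otus[s] = [s]
    return otus


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAT", "TTTTTTTTTT", "ACGTACGTAC"]
    for ref, members in greedy_otus(reads, threshold=0.8).items():
        print(ref, len(members))
```

In this sketch, identical copies of a sequence collapse onto a single reference string, which is the "information artefact" case mentioned above. The result also depends on input order and on the chosen threshold, which is one reason the OTU identification process itself probably needs to be modelled.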
Original comment by p.buttig...@gmail.com
on 6 Feb 2015 at 5:22
Also see OBI's http://purl.obolibrary.org/obo/OBI_0001968 for OTU matrix.
Now that it is 2020, we should also add a term for ASV (amplicon sequence variant).
I am debating whether to make this a subclass of taxon as collection of organisms (TACOO) or information content entity. For the use cases I am aware of, I think TACOO makes a better parent. For example:
@pbuttigieg @cmungall can you think of other use cases where an ICE makes more sense?
I am going to postpone this to the next release so I have more time to think about it.
Issue by GoogleCodeExporter Thursday Jun 11, 2015 at 23:48 GMT Originally opened as https://github.com/rlwalls2008/pco/issues/14
Original issue reported on code.google.com by
rlwalls2...@gmail.com
on 17 Nov 2014 at 5:39