Open GoogleCodeExporter opened 9 years ago
Add table taxon_resources (taxon_id, resource_name varchar, resource_uri varchar)
to allow users to point a taxon at external resources such as uBio, IPNI, etc.
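A minimal sketch of the proposed table, using Python's built-in sqlite3 module rather than the production database. The shape of the taxonomy stand-in table, the sample name "Aus bus", the example.org URIs, and the helper resources_for are all assumptions made for illustration; only the three taxon_resources columns come from the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal stand-in for the existing taxonomy table.
    CREATE TABLE taxonomy (
        taxon_id        INTEGER PRIMARY KEY,
        scientific_name TEXT NOT NULL
    );
    -- Proposed table: one row per (taxon, external resource) link.
    CREATE TABLE taxon_resources (
        taxon_id      INTEGER NOT NULL REFERENCES taxonomy (taxon_id),
        resource_name VARCHAR NOT NULL,
        resource_uri  VARCHAR NOT NULL
    );
""")

# Hypothetical sample data; the URIs are placeholders, not real endpoints.
conn.execute("INSERT INTO taxonomy VALUES (1, 'Aus bus')")
conn.executemany(
    "INSERT INTO taxon_resources VALUES (?, ?, ?)",
    [
        (1, "uBio", "https://example.org/ubio/aus-bus"),
        (1, "IPNI", "https://example.org/ipni/aus-bus"),
    ],
)


def resources_for(conn, taxon_id):
    """Return (resource_name, resource_uri) pairs for one taxon, by name."""
    rows = conn.execute(
        "SELECT resource_name, resource_uri FROM taxon_resources "
        "WHERE taxon_id = ? ORDER BY resource_name",
        (taxon_id,),
    )
    return rows.fetchall()
```

A taxon can carry any number of links, so nothing here limits it to one row per resource.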
Original comment by dust...@gmail.com
on 18 Sep 2008 at 6:46
Further explanation by Dusty in email to Carla 13 Nov 2008:
The proposed taxonomy solution removes the concept of an accepted or unaccepted
record from the taxonomy table and creates collection/taxonomy tables by which
collections can "claim" taxonomy by establishing collection-specific
relationships among taxon terms.
So, given the following data in table taxonomy:
SomeHigherTaxonTerm    ScientificName
z                      a
y                      b
x                      b
Term a is available for use by any collection. It is unique; there are no
conflicts.
Terms b(x) and b(y) are also "available," but may not successfully be used by
bulk Arctos applications because they are not unique, and therefore not
distinguishable by ScientificName. (This constraint does not apply for
single-record updates where a person is available to choose from the
possibilities.)
So we would also add collection-specific relationships:
GoodScientificName    BadScientificName    Collection
b(x)                  b(y)                 1
b(y)                  b(x)                 2
Now, Collection 1 can find the "good" (in their opinion) version of b (b(x))
through a lookup on the scientific name b.
Collection 2 will, with the same lookup, get a different "taxon concept," b(y),
as "their" name when querying on "b".
A record of taxa will be maintained through relationships.
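The lookup described above could be sketched roughly as follows. This is only an illustration of the idea in plain Python, not Arctos code: the Taxon class, the in-memory TAXONOMY and PREFERENCES lists, and the resolve function are hypothetical names, and the data is the example from this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxon:
    higher_term: str
    scientific_name: str


# Example rows from the taxonomy table above.
TAXONOMY = [Taxon("z", "a"), Taxon("y", "b"), Taxon("x", "b")]

# Collection-specific relationships: (good, bad, collection).
PREFERENCES = [
    (Taxon("x", "b"), Taxon("y", "b"), 1),
    (Taxon("y", "b"), Taxon("x", "b"), 2),
]


def resolve(scientific_name, collection_id):
    """Return the taxon a collection means by a name, or None if ambiguous."""
    candidates = [t for t in TAXONOMY if t.scientific_name == scientific_name]
    if len(candidates) == 1:
        # Unique name: usable by any collection without a claim.
        return candidates[0]
    for good, _bad, collection in PREFERENCES:
        if collection == collection_id and good in candidates:
            return good
    # Unknown name, or a shared name this collection has not claimed.
    return None
```

With this data, resolve("b", 1) gives b(x) and resolve("b", 2) gives b(y), while a collection that has claimed neither gets None and would need a person to choose.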
All that said, the structure will be mostly unimportant to users. Users will be
able to:
Create taxonomy, including "alternate opinions" that share a scientific name
"Edit" taxonomy
"Claim" taxonomy as valid for a specific collection
Locate specimens by current, historical, previously applied, or related names
Share taxonomy with anyone
Original comment by carla...@gmail.com
on 20 Nov 2008 at 11:02
Further point for consideration: Can we simply continue to use what we have?
--
Gordon sez: I'm losing interest in supporting conflicting hierarchies: Hanner
has me convinced that this sport is going to fade. Aside from cases of
reticulate evolution (which are largely at the tips of the branches), the tree
of life has to have one true topology, and the BarCode of life plus the 10K
Genome Project may reveal most of that as a simple by-product of trying to do
something useful. We need to do something that non-specialists can use, and
having GenBank, BoLD, EOL, Arctos, etc., all un-synched is seriously confusing.
As far as I'm concerned, source/authority applies to scientific_name, and the
rest is a band-aid to get us by until we (or somebody) can do something truly
authoritative. Right now, we're trying to get some kind of complete higher
taxonomy with nomenclatural codes so we can make shtuff work. And even that is
taking us years.
--
Under this model, we could also alter table taxon_relations, changing
related_taxon_name_id to related_taxon_uri. That would allow us to relate
records to both taxa within Arctos and to external resources (which might
record alternate opinions about higher taxonomy).
Original comment by dust...@gmail.com
on 21 Apr 2009 at 8:36
[deleted comment]
Summary from Arctos pow-wow@MVZ:
Comment 3 probably won't work; we can't all just agree to get along.
A simple tagging system, where Arctos maintains a list of unique scientific
names and a "cluster of assertions," probably won't work either, as MVZ (1)
wants to assert taxon concepts, and (2) can't find their specimens without
doing so.
That brings us back to something like the initial proposal.
Clarified partial Functional Requirements:
1) Locate specimens by any relevant Higher Taxonomy assertion. So, searching for
"Muridae" returns those records that are currently in "Cricetidae" but that
anyone ever thought were in Muridae.
2) Find stuff in those collections that are organized by higher taxonomy, which
means asserting concepts along with names.
3) Use taxon concepts. Functionally, this means that rows can never change. In
the following example, all rows are equally valid and we cannot "fill in the
blanks" in rows 1 and 2.
Row---Scientific_name----Family----Order-----Suborder
1-----A------------------X-------------------Z
2-----A----------------------------Y---------Z
3-----A------------------X---------Y---------Z
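To make requirements 1 and 3 concrete, here is a rough Python sketch under stated assumptions. TaxonConcept, CONCEPTS, SPECIMENS, and find_specimens are hypothetical names, the specimen identifiers are invented, and a frozen dataclass merely stands in for "rows can never change"; the three concept rows are the ones in the example table.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)  # frozen: a concept row cannot be edited after creation
class TaxonConcept:
    scientific_name: str
    family: Optional[str] = None
    order: Optional[str] = None
    suborder: Optional[str] = None


# Rows 1-3 from the example; blanks stay blank (None).
CONCEPTS = [
    TaxonConcept("A", family="X", suborder="Z"),
    TaxonConcept("A", order="Y", suborder="Z"),
    TaxonConcept("A", family="X", order="Y", suborder="Z"),
]

# Hypothetical specimens, keyed by identifier, valued by scientific name.
SPECIMENS = {"specimen-1": "A", "specimen-2": "B"}


def find_specimens(higher_term):
    """Find specimens whose name appears under higher_term in ANY concept."""
    names = {
        c.scientific_name
        for c in CONCEPTS
        if higher_term in (c.family, c.order, c.suborder)
    }
    return sorted(s for s, name in SPECIMENS.items() if name in names)


# Filling in the blank Order of row 1 is refused rather than silently applied.
try:
    CONCEPTS[0].order = "Y"
except FrozenInstanceError:
    print("rows cannot be edited in place")
```

A new opinion would be recorded by appending another TaxonConcept, which is why a search on "Y" still reaches specimen-1 even though row 1 never mentions it.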
Immediately practical questions:
1) Does this mean we should stop trying to fill in the blanks for our current
data? Given the current constraint of maintaining one globally unique scientific
name, can we?
2) How will this affect other applications? We can't properly generate formatted
names without a nomenclatural_code (see Issue 242), and some external
applications (like Ornis) demand Class. Neither of these fields is provided by
AOU, for example. Can we cheat? If so, how do we formalize what's available for
interpretation and what's locked into the Source? Or do we need some
more-elaborate structure to separate the things provided by a Source from the
functional things demanded by various applications? How far do we wish to take
this idea? AOU provides only 4 terms (Order, Family, Genus, Species). That, I
believe, is bordering on not enough information to be useful.
I believe we are still lacking workable functional requirements, and those are
absolutely necessary before this discussion can conclude.
Further considerations:
1) Should we invest the time to publish this as a webservice?
2) If we proceed with (1), should we open up editing (to the extent we allow
editing) to the broader community?
(DLM votes "yes" on both.)
-----------------------------------------------------------------------
Somewhat related, the format for taxon_relations should be:
ID (NOT NULL: FKEY taxon_name_id)
Related_ID (NOT NULL: FKEY taxon_name_id)
Relationship (NOT NULL: FKEY, cttaxon_relations)
Whodunit (NOT NULL: FKEY AGENT_ID)
WhenDunit (NOT NULL: DATE)
Authority (NULL, text, hopefully a citation)
Authority_type (NULL, FKEY ctNewCodeTable_TaxonRelationsType)
Authority_type is an attempt to further quantify the validity or value of a
relationship. Possible values include misspelling, alternate spelling, code
revision, checklist update, publication, and personal assertion.
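For reference, the format above might translate into DDL along these lines. This is a sketch using Python's sqlite3 module, not the production schema: the parent tables are reduced to their keys, the code-table column names and the 'synonym of' relationship value are assumptions, and add_relation is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
    -- Assumed minimal parent tables; real column names may differ.
    CREATE TABLE taxonomy (taxon_name_id INTEGER PRIMARY KEY);
    CREATE TABLE agent (agent_id INTEGER PRIMARY KEY);
    CREATE TABLE cttaxon_relations (relationship TEXT PRIMARY KEY);
    CREATE TABLE ctNewCodeTable_TaxonRelationsType (
        authority_type TEXT PRIMARY KEY
    );

    CREATE TABLE taxon_relations (
        id             INTEGER NOT NULL REFERENCES taxonomy (taxon_name_id),
        related_id     INTEGER NOT NULL REFERENCES taxonomy (taxon_name_id),
        relationship   TEXT    NOT NULL REFERENCES cttaxon_relations (relationship),
        whodunit       INTEGER NOT NULL REFERENCES agent (agent_id),
        whendunit      DATE    NOT NULL,
        authority      TEXT,
        authority_type TEXT
            REFERENCES ctNewCodeTable_TaxonRelationsType (authority_type)
    );

    -- Hypothetical sample rows.
    INSERT INTO taxonomy VALUES (1), (2);
    INSERT INTO agent VALUES (10);
    INSERT INTO cttaxon_relations VALUES ('synonym of');
    INSERT INTO ctNewCodeTable_TaxonRelationsType VALUES
        ('misspelling'), ('alternate spelling'), ('code revision'),
        ('checklist update'), ('publication'), ('personal assertion');
""")


def add_relation(conn, taxon_id, related_id, relationship, agent_id, when,
                 authority=None, authority_type=None):
    """Insert one relationship; constraint violations raise IntegrityError."""
    conn.execute(
        "INSERT INTO taxon_relations VALUES (?, ?, ?, ?, ?, ?, ?)",
        (taxon_id, related_id, relationship, agent_id, when,
         authority, authority_type),
    )


add_relation(conn, 1, 2, "synonym of", 10, "2009-05-04",
             authority_type="publication")
```

Authority and Authority_type stay nullable, so a relationship can be recorded before anyone supplies a citation.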
Original comment by dust...@gmail.com
on 4 May 2009 at 8:01
Original comment by dust...@gmail.com
on 24 Sep 2009 at 12:20
For searching, we need to also be able to crawl "nodes." Given:
Specimen-->TaxonA
TaxonA-->SomeRelationship-->TaxonB
TaxonA-->SomeRelationship-->TaxonC
TaxonB-->SomeRelationship-->TaxonD
TaxonD-->SomeRelationship-->TaxonE
TaxonE-->SomeRelationship-->TaxonF
...
we need to be able to find the specimen by any attribute of any of the involved
taxa.
Furthermore, we need to be able to crawl relationships "backwards" - we need to
find specimens attached to TaxonF by TaxonA attributes.
We may also need to prioritize results by distance and relationship type.
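One way to picture the crawl is a breadth-first search that treats every relationship as traversable in both directions and records the distance from the starting taxon. The sketch below is an assumption about how this could work, not a design decision: the names SPECIMEN_TAXON, RELATIONS, related_taxa, and find_specimens are hypothetical, the specimen identifier is invented, and relationship type is carried in the data but not yet used for ranking.

```python
from collections import deque

# The example graph from this comment.
SPECIMEN_TAXON = {"specimen-1": "TaxonA"}
RELATIONS = [
    ("TaxonA", "SomeRelationship", "TaxonB"),
    ("TaxonA", "SomeRelationship", "TaxonC"),
    ("TaxonB", "SomeRelationship", "TaxonD"),
    ("TaxonD", "SomeRelationship", "TaxonE"),
    ("TaxonE", "SomeRelationship", "TaxonF"),
]


def related_taxa(start):
    """Return {taxon: distance} for all taxa reachable from start.

    Relationships are followed forwards and backwards, so TaxonF
    reaches TaxonA just as TaxonA reaches TaxonF.
    """
    neighbours = {}
    for source, _relationship, target in RELATIONS:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


def find_specimens(taxon):
    """Return (distance, specimen) pairs, nearest relationships first."""
    reachable = related_taxa(taxon)
    hits = [
        (reachable[t], specimen)
        for specimen, t in SPECIMEN_TAXON.items()
        if t in reachable
    ]
    return sorted(hits)
```

Searching from TaxonF finds specimen-1 at distance 4 (F, E, D, B, A). Prioritizing by relationship type would need a weight per relationship in place of the fixed step of 1.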
Original comment by dust...@gmail.com
on 13 May 2010 at 6:58
Original issue reported on code.google.com by
dust...@gmail.com
on 18 Sep 2008 at 6:15