tdwg / PlinianCore

A task group of the "Species Information Interest Group" set to develop a set of vocabulary terms that can be used to describe different aspects of biological species
Apache License 2.0

Recover the previous "Description Types" that were developed in GBIF and handle appropriately #8

Open CynthiaParr-USDA opened 6 years ago

CynthiaParr-USDA commented 6 years ago

This is based on Chuck's problem. The Description Types were developed by Eamonn, who is no longer with GBIF, and they can no longer be found. They have some relationship to the Species Profile Model.

Mapping is straightforward, says Paco.

Outcome may be a mapping document in this GitHub archive and/or discussion.

CynthiaParr-USDA commented 6 years ago

From our notes: Chuck: involved with a big project that was following the GBIF species profile and description types. Can’t find species profile/description types. [Inquire with GBIF admins] Work was led by Eamonn O’Tuama, David Remsen. (EOL involvement?) Was it in GBIF vocabulary server? (Paco said yes) -- so Dag Endresen might have some information

MattBlissett commented 6 years ago

Is this what you mean? https://rs.gbif.org/extension/gbif/1.0/speciesprofile.xml

CynthiaParr-USDA commented 6 years ago

Or maybe there are relevant links here: http://eol.org/info/toc_subjects from this old document from the early days of EOL.

ckmillerjr commented 6 years ago

Yes, EOL used the same GBIF Description Types.

Chuck

WUlate commented 3 years ago

The XML page that contains the definitions of the concepts is "http://rs.gbif.org/vocabulary/gbif/description_type.xml" (with an underscore and a lowercase "t"), but the URIs of the elements inside that same XML document are clearly defined in the format "http://rs.gbif.org/vocabulary/gbif/descriptionType/*" (no underscore and a capital "T").

Note also that http://rs.gbif.org/vocabulary/gbif/description_Type/legislation, for example, doesn't resolve to anything (because it should have a lowercase "t"), whereas the lowercase "t" version of that URL resolves to the XML page http://rs.gbif.org/vocabulary/gbif/description_type.xml. In fact, all "http://rs.gbif.org/vocabulary/gbif/description_type/*" URLs resolve to that same XML document, whose concept URIs have no underscore and a capital "T".
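The three spellings described above can be told apart mechanically. The following is a minimal sketch, assuming only the behaviour reported in this comment; `classify` and its labels are hypothetical names, and the code makes no network requests, so it does not verify what actually resolves.

```python
from urllib.parse import urlsplit

# Both spellings are quoted from the comment above.
CONCEPT_SEGMENT = "descriptionType"      # form used by the concept URIs inside the XML
RESOLVABLE_SEGMENT = "description_type"  # form reported to resolve to the XML page


def classify(uri):
    """Return (kind, term) for a URI under /vocabulary/gbif/ (hypothetical helper)."""
    parts = urlsplit(uri).path.strip("/").split("/")
    if len(parts) != 4 or parts[:2] != ["vocabulary", "gbif"]:
        return ("other", None)
    segment, term = parts[2], parts[3]
    if segment == CONCEPT_SEGMENT:
        return ("concept-uri", term)
    if segment == RESOLVABLE_SEGMENT:
        return ("resolves-to-document", term)
    # Any other spelling, e.g. "description_Type", is reported above as not resolving.
    return ("unknown-spelling", term)
```

A check like this could flag data files that use the resolvable URL form where the concept URI form was intended, or the reverse.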

Archilegt commented 1 year ago

I see a frequent confusion popping up above. The Species Profile Extension and the Species Profile Model are different things.

Species Profile Extension: http://rs.gbif.org/extension/gbif/1.0/speciesprofile.xml

Species Profile Model: It became the "Description Type GBIF Vocabulary", currently at https://rs.gbif.org/vocabulary/gbif/description_type.xml

The Species Profile Extension has 14 terms (properties): six boolean (isMarine, isFreshwater, isTerrestrial, isInvasive, isHybrid, isExtinct), four as text strings and comma-separated values (livingPeriod, lifeForm, habitat, sex), three numeric (ageInDays, sizeInMillimeters, massInGrams), and one unique identifier (datasetID).
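As an illustration of that grouping, here is a rough sketch of how a row of raw strings from a Species Profile Extension file might be coerced by term type. The term names come from the list above; the function name `coerce_profile`, the accepted boolean literals ("true"/"false"), the use of `float` for the numeric terms, and splitting the text terms on commas are all assumptions, not something the extension definition is known to prescribe.

```python
BOOLEAN_TERMS = ("isMarine", "isFreshwater", "isTerrestrial",
                 "isInvasive", "isHybrid", "isExtinct")
TEXT_TERMS = ("livingPeriod", "lifeForm", "habitat", "sex")
NUMERIC_TERMS = ("ageInDays", "sizeInMillimeters", "massInGrams")
IDENTIFIER_TERMS = ("datasetID",)
ALL_TERMS = BOOLEAN_TERMS + TEXT_TERMS + NUMERIC_TERMS + IDENTIFIER_TERMS


def coerce_profile(row):
    """Convert a dict of raw strings into typed values (hypothetical helper)."""
    result = {}
    for term, raw in row.items():
        value = raw.strip()
        if term not in ALL_TERMS:
            raise KeyError("not a Species Profile Extension term: " + term)
        if value == "":
            result[term] = None  # empty cell: no value supplied
        elif term in BOOLEAN_TERMS:
            lowered = value.lower()
            if lowered not in ("true", "false"):  # assumed literals
                raise ValueError("expected true/false for " + term)
            result[term] = lowered == "true"
        elif term in NUMERIC_TERMS:
            result[term] = float(value)
        elif term in TEXT_TERMS:
            # Assumption: comma-separated values become a list of strings.
            result[term] = [part.strip() for part in value.split(",")]
        else:
            result[term] = value  # datasetID is kept verbatim
    return result
```

For example, `coerce_profile({"isMarine": "true", "habitat": "reef, lagoon"})` would give a boolean for `isMarine` and a two-item list for `habitat`.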

The Species Profile Model originally had 33 "types" according to Chuck Miller @ckmillerjr. See this thread. It would be good to check if the SPM was discussed at the TDWG Meeting in Dunedin, New Zealand, 2018, and if anyone kept notes.

The SPM is in use by the Scratchpads Taxon Description content type. It includes six groups of terms and 36 fields (so three more than the original 33): Overview (General description, Biology), Conservation (Conservation status, Legislation, Management, Procedures, Threats, Trends), Description (Diagnostic description, Behaviour, Cytology, Genetics, Growth, Look alikes, Molecular biology, Morphology, Physiology, Size, Taxon biology), Evolution and Systematics (Evolution, Phylogeny), Ecology and Distribution (Dispersal, Associations, Cyclicity, Distribution, Ecology, Habitat, Life cycle, Life expectancy, Migration, Trophic strategy, Population biology, Reproduction), and Relevance (Diseases, Risk statement, Uses).

The Description Type GBIF Vocabulary (= SPM) had 38 terms (concepts) already in 2018, as mentioned by Chuck Miller. There are still 38 terms in 2022. Those are: general, diagnostic, morphology, habit, cytology, physiology, size, weight, lifespan (= life expectancy), lifetime, biology, ecology, habitat, distribution, reproduction, conservation, use, dispersal, cyclicity, lifecycle, migration, growth, genetics, chemistry, diseases, associations, behaviour, population (= population biology), management, legislation, threats, typematerial, typelocality, phylogeny, hybrids, literature, culture, and vernacular (= vernacular names).

Remark: The Species Profile Model as implemented in Scratchpads and the current Description Type GBIF Vocabulary share fewer than 33 terms.
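The overlap can be checked with a small script. This is a sketch, assuming that labels match when they are identical after lowercasing and removing spaces, plus the two equivalences stated above (lifespan = life expectancy, population = population biology). The names `normalize` and `ALIASES` are illustrative only.

```python
SCRATCHPADS_FIELDS = [
    "General description", "Biology",
    "Conservation status", "Legislation", "Management", "Procedures",
    "Threats", "Trends",
    "Diagnostic description", "Behaviour", "Cytology", "Genetics", "Growth",
    "Look alikes", "Molecular biology", "Morphology", "Physiology", "Size",
    "Taxon biology",
    "Evolution", "Phylogeny",
    "Dispersal", "Associations", "Cyclicity", "Distribution", "Ecology",
    "Habitat", "Life cycle", "Life expectancy", "Migration",
    "Trophic strategy", "Population biology", "Reproduction",
    "Diseases", "Risk statement", "Uses",
]

GBIF_TERMS = [
    "general", "diagnostic", "morphology", "habit", "cytology", "physiology",
    "size", "weight", "lifespan", "lifetime", "biology", "ecology", "habitat",
    "distribution", "reproduction", "conservation", "use", "dispersal",
    "cyclicity", "lifecycle", "migration", "growth", "genetics", "chemistry",
    "diseases", "associations", "behaviour", "population", "management",
    "legislation", "threats", "typematerial", "typelocality", "phylogeny",
    "hybrids", "literature", "culture", "vernacular",
]

# Only the equivalences stated in the comment above are used here.
ALIASES = {"lifeexpectancy": "lifespan", "populationbiology": "population"}


def normalize(label):
    """Lowercase, drop spaces, then apply the known aliases."""
    key = label.lower().replace(" ", "")
    return ALIASES.get(key, key)


shared = sorted({normalize(f) for f in SCRATCHPADS_FIELDS} & set(GBIF_TERMS))
only_scratchpads = sorted(f for f in SCRATCHPADS_FIELDS
                          if normalize(f) not in GBIF_TERMS)
print(len(shared), "shared terms")
print("Scratchpads only:", only_scratchpads)
```

Under these matching rules, pairs such as "Uses" and "use" or "General description" and "general" are not counted as shared. Whether they should be is exactly the kind of decision a mapping document would need to record.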

Some immediate challenges:

  1. We don't know what the original 33 terms of the Species Profile Model were. We need those and their documentation. @ckmillerjr, could you lend us a hand here, please?
  2. The Description Type GBIF Vocabulary itself requires revision and cleaning, with special attention to the coherence of the "Alternative Terms" and the meaning of the main terms (e.g., a reference glossary). Most of the term descriptions are tautological with respect to the terms. @MattBlissett, could you help with this, please? Also, some terms are nomenclatural / taxonomic and are covered by other standards / extensions (typematerial, typelocality, vernacular). Literature is covered by a separate extension.
  3. After 1 and 2 are solved, the Species Profile Model (Scratchpads) and the Description Type GBIF Vocabulary need to be harmonized and then merged or mapped, at least for GBIF purposes. I can help with this, but I will need the input of more people.

Hopefully, after all this is done, we will manage to map the harmonized SPM to PlinianCore. From the Scratchpads user side, I would aim at upgrading the Taxon Description content type for Scratchpads 3.0 as per PlinianCore rather than as per the SPM. We are not that many people, and it is better to concentrate the effort on maintaining one Taxon Description standard rather than two. Does this make sense?

ckmillerjr commented 1 year ago

Archilegt, I’m very fuzzy on where I found the reference to 33 Description Types in the GBIF Description Types list. I think the original source of that at GBIF’s website has been deprecated, or I got wrong information from somewhere. There are currently 38 types in the GBIF vocabulary as you state. We use the same 38 in the World Flora Online DwCA file exchange.

The origin of SPM goes back to LSIDs. I found this slide deck from the late Bob Morris (RIP Bob), presented at TDWG 2008: https://slideplayer.com/slide/8528971/ (see https://static.tdwg.org/conferences/2008/tdwg_2008_proceedings.pdf, page 47). I can’t find a way to download it, but it conveys the original concept of SPM in association with LSID ontologies and RDF, and Bob says that Roger Hyam started SPM in 2007. I think what became DescriptionTypes were originally called InfoItems, but it’s not a 1:1 match. So the trail for the genesis of Description Types is long, and you might find more information by searching through the LSID historical records. Here are some breadcrumbs: https://github.com/tdwg/wiki-archive/blob/master/twiki/data/SPM/WebHome.txt, https://github.com/tdwg/wiki-archive/blob/master/twiki/data/TAG/TDWGOntology.txt, https://github.com/tdwg/ontology/blob/master/ontology/voc/SPMInfoItems.rdf.

Sorry if I have introduced a distraction into your work.

Chuck

MattBlissett commented 1 year ago

Hi,

There's a newer version of the Species Profile extension, https://rs.gbif.org/extension/gbif/1.0/speciesprofile_2019-01-29.xml

rs.gbif.org is in GitHub if you need to look at the history: https://github.com/gbif/rs.gbif.org

Changes should be proposed in Git following the instructions at https://github.com/gbif/rs.gbif.org/blob/master/versioning.md. I can follow the software part of this, but I'm not a taxonomist so I would prefer not to be involved in writing the descriptions. @mdoering may have an opinion, as he works on GBIF's Checklistbank.