Open iDigBioBot opened 6 years ago
TestField | Value |
---|---|
GUID | 431467d6-9b4b-48fa-a197-cd5379f5e889 |
Label | AMENDMENT_SCIENTIFICNAMEID_FROM_TAXON |
Description | Proposes an amendment to the value of dwc:scientificNameID if it can be unambiguously resolved from bdq:sourceAuthority using the available taxon terms. |
TestType | Amendment |
Darwin Core Class | dwc:Taxon |
Information Elements ActedUpon | dwc:scientificNameID |
Information Elements Consulted | dwc:taxonID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, dwc:scientificName, dwc:higherClassification, dwc:kingdom, dwc:phylum, dwc:class, dwc:order, dwc:superfamily, dwc:family, dwc:subfamily, dwc:tribe, dwc:subtribe, dwc:genus, dwc:genericName, dwc:subgenus, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:cultivarEpithet, dwc:vernacularName, dwc:scientificNameAuthorship, dwc:taxonRank |
Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificNameID is bdq:NotEmpty, or if all of dwc:scientificName, dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship, and dwc:cultivarEpithet are bdq:Empty, FILLED_IN the value of dwc:scientificNameID for an unambiguously resolved single taxon record in the bdq:sourceAuthority through (1) the value of dwc:scientificName or (2) if dwc:scientificName is bdq:Empty through values of the terms dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship and dwc:cultivarEpithet, or (3) if ambiguity produced by multiple matches in (1) or (2) can be disambiguated to a single Taxon using the values of dwc:subtribe, dwc:tribe, dwc:subgenus, dwc:genus, dwc:subfamily, dwc:family, dwc:superfamily, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:taxonID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, dwc:taxonRank, and dwc:vernacularName; otherwise NOT_AMENDED |
Data Quality Dimension | Conformance |
Term-Actions | TAXONID_FROM_TAXON |
Parameter(s) | bdq:sourceAuthority |
Source Authority | bdq:sourceAuthority default = "GBIF Backbone Taxonomy" {[https://doi.org/10.15468/39omei]} {API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]} |
Specification Last Updated | 2023-09-17 |
Examples | [dwc:taxonID="", dwc:scientificNameID="", dwc:acceptedNameUsageID="", dwc:originalNameUsageID="", dwc:taxonConceptID="", dwc:scientificName="Chicoreus palmarosae (Lamarck, 1822)", dwc:higherClassification="", dwc:kingdom="Animalia", dwc:phylum="Mollusca", dwc:class="Gastropoda", dwc:order="", dwc:family="Muricidae", dwc:subfamily="", dwc:genus="Chicoreus", dwc:genericName="Chicoreus", dwc:subgenus="", dwc:infragenericEpithet="", dwc:specificEpithet="palmarosae", dwc:infraspecificEpithet="", dwc:cultivarEpithet="", dwc:vernacularName="", dwc:scientificNameAuthorship="(Lamarck, 1822)", dwc:taxonRank="", bdq:sourceAuthority="marinespecies.org": Response.status=FILLED_IN, Response.result=dwc:scientificNameID="urn:lsid:marinespecies.org:taxname:208134", Response.comment="dwc:scientificName matched to unique taxon record in WoRMS, exact match on name and authorship. Resolvable at https://marinespecies.org/aphia.php?p=taxdetails&id=208134"] |
[dwc:scientificNameID="", dwc:taxonID="", dwc:acceptedNameUsageID="", dwc:originalNameUsageID="", dwc:taxonConceptID="", dwc:scientificName="Graphis", dwc:higherClassification="", dwc:kingdom="", dwc:phylum="", dwc:class="", dwc:order="", dwc:family="", dwc:subfamily="", dwc:genus="", dwc:genericName="", dwc:subgenus="", dwc:infragenericEpithet="", dwc:specificEpithet="", dwc:infraspecificEpithet="", dwc:cultivarEpithet="", dwc:vernacularName="", dwc:scientificNameAuthorship="", dwc:taxonRank="": Response.status=NOT_AMENDED, Response.result=, Response.comment="dwc:scientificName="Graphis" is ambiguous as could be either a lichen or a gastropod."] | |
Source | FP-Akka |
References | |
Example Implementations (Mechanisms) | Kurator/FilteredPush sci_name_qc Library, FP-KurationServices, Arctos, MCZbase, Symbiota |
Link to Specification Source Code | https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L397 https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L476 |
Notes | Return a result with no value and a Response.status of NOT_AMENDED with a Response.comment of ambiguous if the information provided does not resolve to a unique result (e.g. if homonyms exist and there is insufficient information in the provided data, for example using the lowest ranking taxa in conjunction with dwc:scientificNameAuthorship, to resolve them). When referencing a GBIF taxon by GBIF's identifier for that taxon, use the pseudo-namespace "gbif:" and the form "gbif:{integer}" as the value for dwc:scientificNameID. |
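The decision order in the Expected Response can be read as a small function. The following is only a minimal sketch under stated assumptions, not the sci_name_qc implementation: `lookup` is a hypothetical callable standing in for the source authority (passing `None` stands in for "not available"), each candidate it returns is assumed to be a dict with an integer `key` plus classification terms, only a subset of the disambiguating terms is checked, and the result is written in the "gbif:{integer}" form described in the Notes.

```python
from typing import Callable, Dict, List, Optional

# Terms that name the taxon itself; at least one must be populated.
NAME_TERMS = [
    "dwc:scientificName", "dwc:genericName", "dwc:specificEpithet",
    "dwc:infraspecificEpithet", "dwc:scientificNameAuthorship",
    "dwc:cultivarEpithet",
]
# A subset of the classification terms, used only to break ties in step (3).
TIEBREAK_TERMS = ["dwc:genus", "dwc:family", "dwc:order", "dwc:class",
                  "dwc:phylum", "dwc:kingdom"]

# Hypothetical lookup signature: name string in, candidate records out.
Lookup = Callable[[str], List[Dict[str, object]]]


def is_empty(value: Optional[str]) -> bool:
    """Assumption: None and whitespace-only strings count as bdq:Empty."""
    return value is None or value.strip() == ""


def amend_scientific_name_id(record: Dict[str, str],
                             lookup: Optional[Lookup]) -> Dict[str, object]:
    """Return a dict with 'status' and 'result' for one record."""
    if lookup is None:  # stand-in for "source authority not available"
        return {"status": "EXTERNAL_PREREQUISITES_NOT_MET", "result": None}
    if not is_empty(record.get("dwc:scientificNameID")):
        return {"status": "INTERNAL_PREREQUISITES_NOT_MET", "result": None}
    if all(is_empty(record.get(term)) for term in NAME_TERMS):
        return {"status": "INTERNAL_PREREQUISITES_NOT_MET", "result": None}

    # Step (1): use dwc:scientificName; step (2): otherwise join the parts.
    name = record.get("dwc:scientificName")
    if is_empty(name):
        parts = [record.get(term) for term in NAME_TERMS[1:]]
        name = " ".join(p.strip() for p in parts if not is_empty(p))
    candidates = lookup(name.strip())

    # Step (3): narrow multiple matches with populated classification terms.
    if len(candidates) > 1:
        for term in TIEBREAK_TERMS:
            wanted = record.get(term)
            if not is_empty(wanted):
                candidates = [c for c in candidates if c.get(term) == wanted]

    if len(candidates) == 1:
        value = f"gbif:{candidates[0]['key']}"
        return {"status": "FILLED_IN",
                "result": {"dwc:scientificNameID": value}}
    return {"status": "NOT_AMENDED", "result": None}
```

Note that a record with only dwc:family populated stops at INTERNAL_PREREQUISITES_NOT_MET in this sketch, because the classification terms are consulted only to break ties and never to make the initial match.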
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Moving from scientificName as a string to a link to a guid in a taxonomic or nomenclatural authority is key for moving towards linked open data and other semantic delivery of biodiversity data. There is almost never enough data in flat Darwin Core to fill in any of the other ID terms in the Taxon class, but it is often possible to link scientific name strings to nomenclatural or taxonomic records.
We should add taxonRank to the list of fields for this and #70. It is especially important for the interpretation of monomials in scientific name absent other supporting data.
@godfoder I concur. Do we need to specify a more complex set of prerequisites?
A couple of issues for implementation:
Action to take when taxonID is NOT_EMPTY: The specification is silent on what action to take when dwc:taxonID has a value. Since other tests specify CHANGED only if the term that is proposed to be amended is NOT_EMPTY, the implication is that an amendment is to be proposed, for purposes such as conforming taxonID values to a national authority. This should probably be spelled out in the notes section.
Extraneous terms in the list of Information Elements: The specification states that a proposed amendment should be made "on the basis of the value of the lowest ranking not EMPTY taxon classification terms dwc:scientificName, dwc:scientificNameAuthorship, dwc:kingdom, dwc:phylum, dwc:class, etc.", with @godfoder's comment clearly indicating that taxonRank should be included in this list. The notes imply that none of the other ID terms (dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID) should be included in this analysis, so it seems that they shouldn't be included in the informationElements, unless there is a clear specification of how to include them to infer a value of taxonID. Also, neither dwc:higherClassification nor dwc:vernacularName is included in the specification, and thus neither seems to fit in the list of information elements.
Further to the logic of @chicoreus - I don't understand the inclusion of dwc:scientificNameAuthorship as it isn't a taxon classification term in the hierarchy, and that field alone could not supply a taxonID.
@ArthurChapman I see scientificNameAuthorship as an essential term for identifying which taxonID to use, it can often disambiguate homonyms and if the authorship string associated with the source record for taxonID isn't the same as the authorship string in a record under consideration, then something likely isn't correct and an assertion of a taxonID match is not a good one to make.
@chicoreus - that is correct, but it may need us to reword the test, because as written, I don't see how that field could work as dwc:scientificNameAuthorship is not strictly a classification term. dwc:scientificName should include the authorship and thus could be used to resolve the taxonID. It probably applies to a different test to use the dwc:scientificNameAuthorship to fix dwc:scientificName but as this test is written then the dwc:scientificNameAuthorship on its own doesn't work. I don't see that term belonging in this test.
@ArthurChapman Yes, rewording would be good. Point well taken that the information should be in scientificName and scientificNameAuthorship should be a parse of that rather than a classification term. Pragmatically, scientificNameAuthorship makes it easier to separate the canonical name from the authorship part of scientificName when it contains both (and often, despite the definitions, it doesn't), and services tend to return better results when queried just on canonical name and then have the results examined for similarity of authorship strings. A huge amount of the variability in the wild is in the authorship strings: people's names abbreviated or not, punctuation variability, presence and absence of prefixes, suffixes, and honorifics, and in animal names, the presence or absence of years, etc. Implementation logic needs to deal with this in a consistent way, not farming it off to what may or may not be returned from a particular service when given a value found in dwc:scientificName.
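One way to handle part of that variability consistently is to normalize both authorship strings before comparing them. The sketch below is an illustration only, with hypothetical function names: it folds case and diacritics and ignores periods, commas, and extra whitespace, but it does not expand abbreviations (for example "L." versus "Linnaeus") and deliberately keeps parentheses, since they carry meaning in zoological authorship.

```python
import re
import unicodedata


def normalize_authorship(authorship: str) -> str:
    """Fold case, diacritics, and light punctuation in an authorship string."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", authorship)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Treat periods, commas, and semicolons as separators; keep parentheses.
    text = re.sub(r"[.,;]", " ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())


def authorship_matches(left: str, right: str) -> bool:
    """Compare two authorship strings after normalization."""
    return normalize_authorship(left) == normalize_authorship(right)
```

With this, "(Lamarck, 1822)" and "(Lamarck 1822)" compare as equal, while "Lamarck, 1822" and "(Lamarck, 1822)" do not.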
I defer to those far more expert on names to reword the Expected Response and tune the Information Elements accordingly. I do however offer two comments on general issues relating to this AMENDMENT-
While the precursor VALIDATION #105 tests for dwc:taxonID EMPTY, as I remember it, the 'tests' should be 'stand alone' so we should be explicit here.
I dislike the use of "etc" in the current Expected Response as these provide explicit rules for implementation. If we need to refer to a dwc term, then we MUST specify it.
I still greatly dislike the "etc" in the Expected Response. Related: I also don't like Information Elements that are missing from the Expected Response. We must provide concise unambiguous instructions on this test (and in the test data where I revisited this), and as I am unsure how for example dwc:infraspecificEpithet comes into this...I'll leave to the NAME gurus.
This is a very difficult one. Suggested change from (may need tweaking):
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority was not available; INTERNAL_PREREQUISITES_NOT_MET if all of dwc:kingdom, dwc:phylum, dwc:class, dwc:order, dwc:family, dwc:genus, and dwc:scientificName are EMPTY; AMENDED if a value for dwc:taxonID is unique and resolvable on the basis of the value of the lowest ranking not EMPTY taxon classification terms dwc:scientificName, dwc:scientificNameAuthorship, dwc:kingdom, dwc:phylum, dwc:class, etc.; otherwise NOT_AMENDED
to
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority was not available; INTERNAL_PREREQUISITES_NOT_MET if all of dwc:kingdom, dwc:phylum, dwc:class, dwc:order, dwc:family, dwc:genus, dwc:genericName, dwc:specificEpithet and dwc:scientificName are EMPTY or if dwc:taxonID is not EMPTY; AMENDED if a value for dwc:taxonID is unique and resolvable on the basis of the value of dwc:scientificName, dwc:acceptedNameUsageID, or dwc:originalNameUsageID, or an unambiguous combination of any of the lowest ranking not EMPTY taxon classification terms dwc:infraspecificEpithet, dwc:cultivarEpithet, dwc:specificEpithet, dwc:infragenericEpithet, dwc:genus, dwc:genericName, dwc:subfamily, dwc:family, dwc:class, dwc:order, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:vernacularName, in conjunction with dwc:scientificNameAuthorship and dwc:taxonRank; otherwise NOT_AMENDED
Some of my thinking: 1). in the INTERNAL_PREREQUISITES_NOT_MET - dwc:subfamily dwc:subgenericEpithet, dwc:subfamily, dwc:infraspecificEpithet, etc. cannot on their own give you the TAXONID. With dwc:infraspecificEpithet, for example - you'd need to have at least the dwc:species.
Good luck with getting your heads around this one.
@Tasilee - if accepted it will need some work for the examples.
@chicoreus @tucotuco - does dwc:taxonConceptID come into this anywhere - i.e. to help distinguish the Taxon ID - it would possibly be beside dwc:scientificNameAuthorship and dwc:taxonRank in the "conjunction with" area. I'd prefer to leave it out.
Thanks @ArthurChapman. My eyes glaze over when it comes to names. I have enough understanding to be dangerous. I therefore defer to @ArthurChapman, @chicoreus, @tucotuco and hopefully a few more watching on for advice, or thumbs up etc.
As it stands, your Expected Response is at least explicit. My only quibble may be the use of "in conjunction with ...". How?
@Tasilee. We suggested "in conjunction with" in the ZOOM call. It means that if there is a homonym (possibly in different Kingdoms, or even within a genus) - then you would need "the lowest ranking taxon term" in conjunction with "dwc:scientificNameAuthorship" to separate the homonyms to get a TAXONID. This is not the case with dwc:scientificName for example, as this contains the Authorship by definition.
This would at least need to be in the Notes then?
@Tasilee - sounds reasonable - I suggest we will do Notes after we agree on the wording of the Expected Response.
In going through the data, there is an anomaly that is fixed if we accept the last suggested Expected Response from @chicoreus. I have just moved the "if dwc:taxonID is not EMPTY" to the start of the INTERNAL_PREREQUISITES_NOT_MET list as it makes parsing simpler.
@Tasilee I can't see the suggested Expected Response from @chicoreus to which you refer. You seem to have added my suggestion - but I am still not sure about the placement of dwc:taxonRank in the Response.
dwc:subfamily, dwc:genericName, dwc:infragenericEpithet and dwc:cultivarEpithet need to be added to the Information Elements and we should add something to the Notes as suggested by you above. Suggest instead of
[bdq:sourceAuthority default = GBIF Backbone Taxonomy]. (Currently found at: https://www.gbif.org/en/developer/species). This is the taxonID inferred from the Darwin Core Taxon class, not from any other sense of Taxon. Return a result with no value and a result state of ambiguous if the information provided does not resolve to a unique result (e.g. if homonyms exist and there is insufficient information in the provided data to resolve them)
we use
[bdq:sourceAuthority default = GBIF Backbone Taxonomy]. (Currently found at: https://www.gbif.org/en/developer/species). This is the taxonID inferred from the Darwin Core Taxon class, not from any other sense of Taxon. Return a result with no value and a result state of ambiguous if the information provided does not resolve to a unique result (e.g. if homonyms exist and there is insufficient information in the provided data, for example using the lowest ranking taxa in conjunction with dwc:scientificNameAuthorship, to resolve them).
@ArthurChapman for "Return a result with no value and a result state of ambiguous", we've moved away from using ambiguous as a Response.status, so this would be Response.status of NOT_AMENDED and ambiguous in the Response.comment or the proposed Response.qualifier extension (this makes a good example of a Response.qualifier=AMBIGUOUS being a good structured qualifier for the Response, with more details in the Response.comment). (Also noting, we seem to be settling on Response.result instead of Response.value or result value)
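As a rough illustration of that shape, the sketch below models a Response with status, result, comment, and the proposed qualifier. The class and helper names are hypothetical, and the qualifier field reflects the proposed extension discussed here rather than an agreed part of the framework.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Response:
    """Hypothetical container for a test response."""
    status: str
    result: Optional[Dict[str, str]] = None
    comment: str = ""
    # Proposed extension: a structured qualifier such as "AMBIGUOUS".
    qualifier: Optional[str] = None


def ambiguous_response(scientific_name: str) -> Response:
    """Build a NOT_AMENDED response for a name with multiple matches."""
    return Response(
        status="NOT_AMENDED",
        result=None,
        comment=(f'dwc:scientificName="{scientific_name}" is ambiguous: '
                 "more than one taxon record matched."),
        qualifier="AMBIGUOUS",
    )
```

Keeping the status as NOT_AMENDED and moving "ambiguous" into the qualifier and comment means consumers only need to handle the fixed set of status values.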
@ArthurChapman I'm not seeing it either (checked recent email threads), @Tasilee was the latest expected response from me something you wrote down in the last call?
OK - I will make a change in the Notes to:
[bdq:sourceAuthority default = GBIF Backbone Taxonomy]. (Currently found at: https://www.gbif.org/en/developer/species). This is the taxonID inferred from the Darwin Core Taxon class, not from any other sense of Taxon. Return a result with no value and a Response.status of NOT_AMENDED with a Response.comment of ambiguous if the information provided does not resolve to a unique result (e.g. if homonyms exist and there is insufficient information in the provided data, for example using the lowest ranking taxa in conjunction with dwc:scientificNameAuthorship, to resolve them).
@chicoreus - the Expected Response to which @tasilee was referring was apparently mine - not yours (I just spoke to him by phone).
Aligning with #70, specification could read:
INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonID is not EMPTY or if all of dwc:scientificName, dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:taxonRank, dwc:scientificNameAuthorship, dwc:cultivarEpithet are EMPTY; AMENDED to the value of taxonID for an unambiguously resolved single taxon record in the specified source authority service through (1) the value of dwc:scientificName and dwc:cultivarEpithet, or (2) if dwc:scientificName is EMPTY through values of the terms dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:taxonRank, dwc:scientificNameAuthorship (and if not EMPTY, dwc:cultivarEpithet), or (3) if ambiguity produced by multiple matches in (1) or (2) can be disambiguated to a single Taxon using the values of dwc:subgenus, dwc:genus, dwc:family, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, and dwc:vernacularName; otherwise NOT_AMENDED
Key piece (from #70) to add to the notes: The terms dwc:subgenus, dwc:genus, dwc:family, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID should not be used to make a match if dwc:taxonID and dwc:scientificName or dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:taxonRank, dwc:scientificNameAuthorship are empty.
This expresses the assertion that if only dwc:genus is populated, the taxonID for that genus should not be filled in as the amendment, as the dwc:genus (and dwc:family and up) is a classification term for the Taxon, not necessarily a constituent part of the name of the Taxon.
I don't think this is correct and I find this wording totally confusing
I am not familiar enough with TAXONIDs but if you only have a Family - doesn't that have a TAXONID? When I previously worded #57, I assumed that all names in the hierarchy would have a TAXONID - Am I wrong?
dwc:cultivarEpithet shouldn't be treated any differently to dwc:infraspecificEpithet so why would it be in 1)?
I can't get my head around the rest - sorry. Needs another attempt
See the thinking behind my current version under my comment of 5 days ago.
The definition for dwc:family is "The full scientific name of the family in which the taxon is classified." The set of higher taxonomy terms from dwc:genus on up (and this is why dwc:genericName "The genus part of the scientificName without authorship." was needed distinct from dwc:genus), are classifiers for the taxon, not the taxon. If only dwc:family is supplied we have no idea which taxon within that family is being referenced. This is distinctly different from the case where dwc:family and dwc:scientificName contain the same value. Here it is unambiguously clear that the Taxon in question is the Family, but when Family is populated, and scientificName and taxonID are empty, we have no way of telling which taxon is being referenced, it might be the Family, or it might be any taxon that can be placed within that Family.
Yes, totally confusing....
On Fri, 11 Mar 2022 13:23:33 -0800 Arthur Chapman @.***> wrote:
I am not familiar enough with TAXONIDs but if you only have a Family - doesn't that have a TAXONID? When I previously worded #57, I assumed that all names in the hierarchy would have a TAXONID - Am I wrong?
Interesting - In the Botanical Code a taxon is described as "Taxonomic groups at any rank will, in this Code, be referred to as taxa (singular: taxon)."
Darwin Core Definition is "A group of organisms (sensu http://purl.obolibrary.org/obo/OBI_0100026) considered by taxonomists to form a homogeneous unit."
As a taxonomist - I have always regarded a Family name as a taxonomic name at the rank of Family. If we are using the term taxon in a different way - then we need to be clear and define it differently
That's exactly it, in Darwin Core, dwc:family is already defined differently. In that context it means the family into which the dwc:Taxon record is currently classified. As I read that definition, a dwc:Taxon record where only the dwc:family is populated is one for which you do not know what the Taxon is. Again, the case where dwc:scientificName or dwc:taxonID contains a value and dwc:family contains a value is totally different, even if the value in dwc:scientificName is identical to that in the dwc:family; in that case the dwc:Taxon record is for the family, and that botanical code definition applies. When only dwc:family is populated, it is a reference to a Taxon, but as dwc:family, it is only a reference to that Taxon, not an unambiguous indicator of what taxon the current dwc:Taxon instance is referring to. The issue isn't in the definition of Taxon, that is clear. The issue is that dwc:family is an explicit reference to another Taxon into which the current dwc:Taxon record is classified. Consider:
9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 is a dwc:Taxon
9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 has dwc:Family Muricidae Rafinesque, 1815

ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74 is a dwc:Taxon
ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74 has dwc:Family Muricidae Rafinesque, 1815
ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74 has dwc:scientificName Muricidae Rafinesque, 1815
ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74 has dwc:taxonID urn:lsid:marinespecies.org:taxname:148

4ecedbbd-6524-4b6f-a2f7-d3b61eda252a is a dwc:Taxon
4ecedbbd-6524-4b6f-a2f7-d3b61eda252a has dwc:Family Muricidae Rafinesque, 1815
4ecedbbd-6524-4b6f-a2f7-d3b61eda252a has dwc:scientificName Chicoreus brevifrons (Lamarck, 1822)
4ecedbbd-6524-4b6f-a2f7-d3b61eda252a has dwc:taxonID urn:lsid:marinespecies.org:taxname:558803
Can you tell from the information given what either the dwc:scientificName or dwc:taxonID of 9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 is? I can't. There is a dwc:Taxon in this data set for Muricidae, but it is ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74, and from the information given, it isn't possible to tell if 9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 is the same taxon as ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74.
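The same argument can be expressed as a check over the three records. This is a hedged sketch with a hypothetical helper name: it assumes that only dwc:scientificName or dwc:taxonID says which taxon a record is about, and that dwc:family alone only says where that taxon is classified.

```python
from typing import Dict

# Terms assumed to identify the taxon itself, following the argument above.
IDENTIFYING_TERMS = ("dwc:scientificName", "dwc:taxonID")

# The three dwc:Taxon records from the example, keyed by their identifiers.
records: Dict[str, Dict[str, str]] = {
    "9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568": {
        "dwc:family": "Muricidae Rafinesque, 1815",
    },
    "ea6ff6cf-c21e-476f-854c-1c8cf4a3cd74": {
        "dwc:family": "Muricidae Rafinesque, 1815",
        "dwc:scientificName": "Muricidae Rafinesque, 1815",
        "dwc:taxonID": "urn:lsid:marinespecies.org:taxname:148",
    },
    "4ecedbbd-6524-4b6f-a2f7-d3b61eda252a": {
        "dwc:family": "Muricidae Rafinesque, 1815",
        "dwc:scientificName": "Chicoreus brevifrons (Lamarck, 1822)",
        "dwc:taxonID": "urn:lsid:marinespecies.org:taxname:558803",
    },
}


def taxon_is_identified(record: Dict[str, str]) -> bool:
    """True if the record says which taxon it is, not just where it sits."""
    return any(record.get(term, "").strip() for term in IDENTIFYING_TERMS)


# Records whose taxon cannot be determined from the data given.
unidentified = [key for key, rec in records.items()
                if not taxon_is_identified(rec)]
```

Under these assumptions only the first record ends up in `unidentified`, even though all three share the same dwc:family value.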
OK - but still, if you only have a record with a Family i.e. Muricidae Rafinesque, 1815 and no information below - i.e. I have only been able to identify this taxon to Family - shouldn't we then add the Taxon ID if the sourceAuthority provides a TAXONID for a family name [I am not familiar enough to know if they do - if they don't then OK - I defer to what you propose]
9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 is a dwc:Taxon
9fd1e0b1-5e62-49eb-8bdd-4aee2c58e568 has dwc:Family Muricidae Rafinesque, 1815
xxxxxx-xxxx-xxxx-xxx-xxxxxxx has dwc:taxonID urn:lsid:marinespecies.org:taxname: xxx
Is there consensus on the Expected Response?
Not yet! There are some questions still to be answered (the email I sent around on #57 and #70) - for example
on treatment of dwc:cultivarEpithet different to dwc:infraspecificEpithet (I believe they shouldn't be differently treated - @tucotuco to clarify use of dwc:cultivarEpithet) - see my comment of two days ago
and whether or not the higher categories have TAXONIDs. From my email "I am still not fully convinced re TAXONID and higher level taxa. Does the sourceAuthority (GBIF?) give a TAXONID for a family name? I am not familiar enough with TAXON ID to know. If they don't then I accept @chicoreus arguments. But if they do, and a record has only a name at the Family level with no information at a lower level (i.e. I have only been able to identify this record to Family). If the sourceAuthority gives a Taxon ID for the Family - then why would we not use that TAXONID for the record?
This is particularly relevant as the Botanical Code defines a taxa as "Taxonomic groups at any rank will, in this Code, be referred to as taxa (singular: taxon)." In the Zoological Code: "A taxonomic unit, whether named or not: i.e. a population, or group of populations of organisms which are usually inferred to be phylogenetically related and which have characters in common which differentiate (q.v.) the unit (e.g. a geographic population, a genus, a family, an order) from other such units. A taxon encompasses all included taxa of lower rank (q.v.) and individual organisms. The Code fully regulates the names of taxa only between and including the ranks of superfamily and subspecies" The Zoological Code treats a family name as a taxon
"family name or name of a family A scientific name of a taxon at the rank of family."
Darwin Core definition "A group of organisms (sensu http://purl.obolibrary.org/obo/OBI_0100026) considered by taxonomists to form a homogeneous unit." It gives the example of "The genus Truncorotaloides as published by Brönnimann et al. in 1953 in the Journal of Paleontology Vol. 27(6) p. 817-820."
Not yet! There are some questions still to be answered (the email I sent around on #57 and #70) - for example
- on treatment of dwc:cultivarEpithet different to dwc:infraspecificEpithet (I believe they shouldn't be differently treated - @tucotuco to clarify use of dwc:cultivarEpithet) - see my comment of two days ago
My understanding is that a cultivarEpithet should be as determinant of a Taxon as an infraspecificEpithet is and treated in the same way.
- and whether or not the higher categories have TAXONIDs. From my email "I am still not fully convinced re TAXONID and higher level taxa. Does the sourceAuthority (GBIF?) give a TAXONID for a family name? I am not familiar enough with TAXON ID to know. If they don't then I accept @chicoreus arguments. But if they do, and a record has only a name at the Family level with no information at a lower level (i.e. I have only been able to identify this record to Family). If the sourceAuthority gives a Taxon ID for the Family - then why would we not use that TAXONID for the record? This is particularly relevant as the Botanical Code defines a taxa as "Taxonomic groups at any rank will, in this Code, be referred to as taxa (singular: taxon)." In the Zoological Code: "A taxonomic unit, whether named or not: i.e. a population, or group of populations of organisms which are usually inferred to be phylogenetically related and which have characters in common which differentiate (q.v.) the unit (e.g. a geographic population, a genus, a family, an order) from other such units. A taxon encompasses all included taxa of lower rank (q.v.) and individual organisms. The Code fully regulates the names of taxa only between and including the ranks of superfamily and subspecies" The Zoological Code treats a family name as a taxon "family name or name of a family A scientific name of a taxon at the rank of family." Darwin Core definition "A group of organisms (sensu http://purl.obolibrary.org/obo/OBI_0100026) considered by taxonomists to form a homogeneous unit." It gives the example of "The genus Truncorotaloides as published by Brönnimann et al. in 1953 in the Journal of Paleontology Vol. 27(6) p. 817-820."
I agree with @chicoreus about the case where dwc:family (and no lower rank) is populated and dwc:scientificName is not, for the simple fact that the Taxon is ambiguous. Specifically, it MIGHT be the family, but it might be something in the family. Probably way too subtle for most people to worry about, but I think it's correct.
OK - if we accept the reasoning of @chicoreus and @tucotuco
INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonID is not EMPTY or if all of dwc:scientificName, dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship, and dwc:cultivarEpithet are EMPTY, AMENDED the value of taxonID for an unambiguously resolved single taxon record in the specified source authority service through (1) the value of dwc:scientificName or (2) if dwc:scientificName is EMPTY through values of the terms dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:scientificNameAuthorship and dwc:cultivarEpithet, or (3) if ambiguity produced by multiple matches in (1) or (2) can be disambiguated to a single Taxon using the values of dwc:subgenus, dwc:genus, dwc:subfamily, dwc:family, dwc:order, dwc:class, dwc:phylum, dwc:kingdom, dwc:higherClassification, dwc:scientificNameID, dwc:acceptedNameUsageID, dwc:originalNameUsageID, dwc:taxonConceptID, dwc:taxonRank, and dwc:vernacularName; otherwise NOT_AMENDED
If accepted it appears that we can take dwc:genericName and dwc:infragenericEpithet out of Information Elements
Note:
I have to defer to @chicoreus, @ArthurChapman and @tucotuco on this. I will apply @ArthurChapman's latest Expected Response, with a few more tweaks.
Are we all happy with the specifications on this one now?
Changed "AMENDED" to "FILLED_IN" in accordance with discussions April 16.
Amended Example to align with @chicoreus comments in email 17th June 2022.
So the text of cultivarEpithet should also be found in scientificName?
On Sun, 13 Mar 2022 22:37:49 -0700 John Wieczorek @.***> wrote:
- on treatment of dwc:cultivarEpithet different to dwc:infraspecificEpithet (I believe they shouldn't be differently treated - @tucotuco to clarify use of dwc:cultivarEpithet) - see my comment of two days ago
My understanding is that a cultivarEpithet should be as determinant of a Taxon as an infraspecificEpithet is and treated in the same way.
Yes, I think it should. But for a definitive answer it is best to ask someone such as @mdoering and @ nielsklazenga.
@nielsklazenga - any comments? [space inadvertently included in last post by @tucotuco]
Regarding cultivarEpithet, yes, that is part of the scientificName string.
Why don't we have an "EXTERNAL_PREREQUISITES_NOT_MET" if we reference bdq:sourceAuthority?!
I've added it as otherwise it will stuff up the test data work.
Changed Parameter(s) to "bdq:sourceAuthority" as per discussions 12th June 2023
I have added to the Notes to be consistent with https://github.com/tdwg/bdq/issues/71:
"When referencing a GBIF taxon by GBIF's identifier for that taxon, use the pseudo-namespace "gbif:" and the form "gbif:{integer}" as the value for dwc:taxonID."
Will need to include the new terms dwc:superfamily, dwc:tribe, dwc:subtribe https://github.com/tdwg/dwc/issues/65 https://github.com/tdwg/dwc/issues/45 https://github.com/tdwg/dwc/issues/46
Added the terms dwc:superfamily, dwc:tribe, dwc:subtribe to the Information elements and Expected response, and updated Specification Last Updated.
On this one, please check my Expected response.
Amended Source Authority values to align with @chicoreus syntax
From
bdq:sourceAuthority default = "GBIF Backbone Taxonomy" [https://doi.org/10.15468/39omei] | | | API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]
to
bdq:sourceAuthority default = "GBIF Backbone Taxonomy" {[https://doi.org/10.15468/39omei]} {API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=]}
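Because the API endpoint in this value ends with `name=`, a lookup URL can be formed by appending the percent-encoded name. The helper below is a hypothetical sketch that only builds the URL; it makes no request and assumes nothing about the shape of the service's reply.

```python
from urllib.parse import quote

# API endpoint exactly as given in the Source Authority value above.
GBIF_BACKBONE_ENDPOINT = (
    "https://api.gbif.org/v1/species"
    "?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name="
)


def build_lookup_url(name: str, endpoint: str = GBIF_BACKBONE_ENDPOINT) -> str:
    """Append a percent-encoded name to an endpoint that ends in 'name='."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("name must not be empty")
    # safe="" makes quote() encode every reserved character in the name.
    return endpoint + quote(cleaned, safe="")
```

Taking the endpoint as a parameter keeps the helper usable when bdq:sourceAuthority is set to something other than the default, provided that authority follows the same trailing `name=` convention.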
Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".