Open myrmoteras opened 2 months ago
I can easily put the query for nanotyrannus into the advanced tab, but I'd prefer to make it a bit more generally useful first.
There is no query for all synonyms yet; for Tyrannosaurus I just manually ran the query for all synonyms and removed all entries without conflicts by hand.
Here is a more general query:
################################################################################
# #
# Note: This query ONLY works with the treatment.ld.plazi.org sparql endpoint! #
# #
################################################################################
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX dwcFP: <http://filteredpush.org/ontologies/oa/dwcFP#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX treat: <http://plazi.org/vocab/treatment#>
SELECT DISTINCT ("--" AS ?simple) ?name ?authority (GROUP_CONCAT(?treatment; separator=",") AS ?treatments) WHERE {
?tc treat:hasTaxonName ?name .
?tc dwc:scientificNameAuthorship ?authority1 .
GRAPH ?treatment {
?tc dwc:scientificNameAuthorship ?authority .
}
FILTER(?authority1 != ?authority)
}
GROUP BY ?name ?authority
ORDER BY ?name
LIMIT 100
Running a similar query shows that there are 26467 names with multiple authorities in the data, so fixing them manually would take considerable effort.
For a given taxon name, the following query lists all treatments for it, with their authority and some useful metadata to help decide which one is correct:
################################################################################
# #
# Note: This query ONLY works with the treatment.ld.plazi.org sparql endpoint! #
# #
################################################################################
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX treat: <http://plazi.org/vocab/treatment#>
SELECT DISTINCT ?name ?authority ?year ?treatment ?authors ?title WHERE {
# Replace Name here as relevant
BIND(<http://taxon-name.plazi.org/id/Animalia/Laelaps_incrassatus> AS ?name)
?tc treat:hasTaxonName ?name .
GRAPH ?treatment {
?tc dwc:scientificNameAuthorship ?authority .
}
BIND(IRI(REPLACE(STR(?treatment), "https", "http")) AS ?treatment_http)
?treatment_http dc:creator ?authors ;
dc:title ?title ;
treat:publishedIn/dc:date ?year .
}
ORDER BY ?year
For example, for Laelaps incrassatus, it gives which to me indicates that the latter two treatments are probably wrong and should be fixed with
A quick glance at the list provided by the first query above shows that most "disagreements" are `(Name, 1234)` vs `Name, 1234` (i.e. `Name` as baseAuthority vs as authority).
These cannot be fixed easily "after-the-fact" and require a human to check if it is supposed to be base- or non-base-authority.
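Although such pairs need a human decision, they can at least be flagged automatically so that reviewers see them grouped together. The following is a minimal Python sketch, assuming the authority strings have already been fetched with the first query; the function name `differs_only_by_parentheses` is hypothetical and not part of Synospecies.

```python
def differs_only_by_parentheses(a: str, b: str) -> bool:
    """Return True if a and b are identical except that exactly one of
    them is wrapped in parentheses, e.g. "(Name, 1234)" vs "Name, 1234"."""

    def unwrap(s):
        # Returns the inner text and whether parentheses were removed.
        s = s.strip()
        if s.startswith("(") and s.endswith(")"):
            return s[1:-1].strip(), True
        return s, False

    core_a, wrapped_a = unwrap(a)
    core_b, wrapped_b = unwrap(b)
    # Same core text, but only one side carried the parentheses.
    return core_a == core_b and wrapped_a != wrapped_b
```

This only detects the pattern; it does not decide which form is correct, so flagged pairs would still go to manual review.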
However, I have found a handful of cases that could be "fixed" as such:
- `&` vs `,` to separate names
- `A. Name` or `A. A. Name` should be normalized to `Name`
- `L.` → `Linnaeus`
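The cases listed here could be turned into a comparison key roughly as follows. This is a sketch under assumptions, not the Synospecies implementation: the function names are hypothetical, the regular expression for initials is a guess at what "A. Name" covers in practice, and the only abbreviation taken from this thread is `L.`. The result is meant for deciding whether two authority strings agree, not for display, since it rewrites `&` as `,`.

```python
import re

# Abbreviation expansions. Only "L." -> "Linnaeus" comes from the thread;
# any further entries would have to be curated by hand.
ABBREVIATIONS = {"L.": "Linnaeus"}

# Leading initials such as "A. " or "A. A. " directly before a surname,
# optionally preceded by an opening parenthesis (which is kept).
INITIALS = re.compile(r"^(\(?)(?:[A-Z]\.\s*)+(?=[A-Z][a-z])")


def normalize_author(author: str) -> str:
    author = author.strip()
    if author in ABBREVIATIONS:
        return ABBREVIATIONS[author]
    return INITIALS.sub(r"\1", author)


def normalize_authority(authority: str) -> str:
    # Treat "&" and "," as the same separator between names (and the year).
    parts = re.split(r"\s*[&,]\s*", authority.strip())
    return ", ".join(normalize_author(p) for p in parts if p)
```

Two authorities from different treatments would then count as agreeing when `normalize_authority` returns the same string for both.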
In other cases, some variants are redundant shorter versions of others, so these could be hidden in Synospecies (and moved into a small (i)-popup with an "Authority also given as:" notice):
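One possible way to pick the variants to hide is sketched below in Python. It assumes that "redundant shorter version" means a case-insensitive substring of a longer variant; that definition is an assumption made for illustration, and the function name `split_redundant` is hypothetical.

```python
def split_redundant(variants: list[str]) -> tuple[list[str], list[str]]:
    """Split authority variants into (shown, hidden).

    A variant is hidden if it is strictly shorter than another variant
    and contained in it, ignoring case.
    """
    # Drop exact duplicates while keeping the original order.
    unique = list(dict.fromkeys(v.strip() for v in variants))
    shown, hidden = [], []
    for v in unique:
        is_redundant = any(
            len(v) < len(other) and v.lower() in other.lower()
            for other in unique
        )
        (hidden if is_redundant else shown).append(v)
    return shown, hidden
```

The `hidden` list is what would go into the "Authority also given as:" popup, while `shown` stays in the main view.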
@nleanba can you please add the SPARQL query for finding conflicting authorities to the canned queries in "advanced"? See the one you did on T. rex.