Open jhpoelen opened 2 years ago
with newly added (basic) support for matching against batnames, I found:
$ nomer ls batnames | wc -l
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [batnames]
1456
which matches the batnames.org website.
Also, an example of a single match:
$ echo -e "\tRhinolophus sinicus" | nomer append batnames
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [batnames]
Rhinolophus sinicus HAS_ACCEPTED_NAME https://batnames.org/species/Rhinolophus%20sinicus Rhinolophus sinicus Chinese Rufous Horseshoe Bat @en https://batnames.org/species/Rhinolophus%20sinicus
or
$ echo -e "\tRhinolophus sinicus" | nomer append --include-header batnames | mlr --itsvlite --omd cat
| providedExternalId | providedName | relationName | resolvedExternalId | resolvedName | resolvedRank | resolvedCommonNames | resolvedPath | resolvedPathIds | resolvedPathNames | resolvedExternalUrl | resolvedThumbnailUrl |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Rhinolophus sinicus | HAS_ACCEPTED_NAME | https://batnames.org/species/Rhinolophus%20sinicus | Rhinolophus sinicus | | Chinese Rufous Horseshoe Bat @en | | | | https://batnames.org/species/Rhinolophus%20sinicus | |
@ajacsherman here's the list of all indexed batnames names retrieved via
$ nomer ls --include-header batnames | mlr --itsvlite --csv cat
In attempting to align MDD with batnames using:
$ curl "https://raw.githubusercontent.com/mammaldiversity/mammaldiversity.github.io/master/_data/mdd.csv" | mlr --csv filter '$order == "CHIROPTERA"' | mlr --csv cut -f sciName | sed 's/_/ /g' | sed 's/^/\t/g' | nomer append batnames | grep NONE | head
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [batnames]
sciName NONE sciName
Chironax tumulus NONE Chironax tumulus
Lissonycteris angolensis NONE Lissonycteris angolensis
Coelops hirsutus NONE Coelops hirsutus
Doryrhina corynophyllus NONE Doryrhina corynophyllus
Doryrhina edwardshilli NONE Doryrhina edwardshilli
Doryrhina muscinus NONE Doryrhina muscinus
Doryrhina semoni NONE Doryrhina semoni
Doryrhina stenotis NONE Doryrhina stenotis
Doryrhina wollastoni NONE Doryrhina wollastoni
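it appears that 61 Chiroptera names defined in MDD have no accepted-name match in batnames (the `sciName` row above is the CSV header, skipped below with `tail -n+2`), as counted
via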
$ curl "https://raw.githubusercontent.com/mammaldiversity/mammaldiversity.github.io/master/_data/mdd.csv" | mlr --csv filter '$order == "CHIROPTERA"' | mlr --csv cut -f sciName | sed 's/_/ /g' | sed 's/^/\t/g' | tail -n+2 | nomer append batnames | grep NONE | wc -l
61
this prompted a response from Nancy S., author of batnames, pointing out that the batnames integration is far from complete.
So, additional work is needed to support synonyms etc.
related to #90
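A minimal sketch of the input-preparation and counting steps used above, in case it helps when re-running the comparison once synonym support lands. The helper names `to_nomer_input` and `count_unmatched` are hypothetical (not part of nomer), and the sample rows are made up to mimic the tab-separated layout shown earlier (providedExternalId, providedName, relationName, ...). Comparing the third column exactly, rather than using `grep NONE`, avoids counting rows that merely contain the text NONE somewhere else.

```shell
# Hypothetical helper: turn a one-column CSV (header + names like
# "Chironax_tumulus") into nomer input lines: "<empty id><TAB><name>".
# Skips the header row so it is not matched as a name.
to_nomer_input() {
  awk 'NR > 1 { gsub(/_/, " "); print "\t" $0 }'
}

# Hypothetical helper: count rows whose relationName (3rd tab-separated
# field) is exactly NONE. Assumes the TSV layout shown in this issue.
count_unmatched() {
  awk -F'\t' '$3 == "NONE" { n++ } END { print n + 0 }'
}

# Made-up sample rows standing in for `nomer append batnames` output.
printf '\tRhinolophus sinicus\tHAS_ACCEPTED_NAME\thttps://batnames.org/species/Rhinolophus%%20sinicus\tRhinolophus sinicus\n\tChironax tumulus\tNONE\t\tChironax tumulus\n\tCoelops hirsutus\tNONE\t\tCoelops hirsutus\n' \
  | count_unmatched   # prints 2

# With nomer installed, the two helpers would slot into the pipeline as:
#   ... | mlr --csv cut -f sciName | to_nomer_input | nomer append batnames | count_unmatched
```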