globalbioticinteractions / nomer

maps identifiers and names to other identifiers and names
GNU General Public License v3.0

add support for world of flora online #96

Closed jhpoelen closed 1 year ago

jhpoelen commented 2 years ago

http://www.worldfloraonline.org/

jhpoelen commented 2 years ago

Suggested by JT and @seltmann as the most commonly preferred plant catalog.

jhpoelen commented 2 years ago

Some summary statistics of World of Flora Online look promising, especially because the licensing is CC0.

$ wget http://104.198.143.165/files/WFO_Backbone/_WFOCompleteBackbone/WFO_Backbone.zip
$ unzip -p WFO_Backbone.zip classification.txt | mlr --tsvlite cut -f taxonomicStatus | sort | uniq -c | sort -nr
 827578 SYNONYM
 444571 ACCEPTED
 112380 UNCHECKED
  40473 DOUBTFUL
     55 HETEROTYPICSYNONYM
      4 HOMOTYPICSYNONYM
      1 taxonomicStatus
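
For anyone without `mlr` installed, the same tally can be sketched in Python. This is a minimal sketch, not part of nomer: the function names are made up for illustration, and it assumes that `classification.txt` is UTF-8, tab-separated, and has a `taxonomicStatus` header column as the command above suggests. Because the header row is consumed as column names here, it is not counted as a value (the `1 taxonomicStatus` line above is presumably the header passing through `uniq -c`).

```python
import csv
import io
import zipfile
from collections import Counter


def count_taxonomic_status(tsv_lines):
    """Count the values of the taxonomicStatus column in tab-separated text.

    tsv_lines is any iterable of text lines whose first line is the header.
    """
    reader = csv.DictReader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(row["taxonomicStatus"] for row in reader)


def count_from_zip(zip_path, member="classification.txt"):
    """Stream one member of the archive and tally it without unpacking to disk.

    Assumption: the member is UTF-8 encoded.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return count_taxonomic_status(text)


if __name__ == "__main__":
    # Made-up sample rows for illustration only, not real WFO records.
    sample = io.StringIO(
        "taxonID\tscientificName\ttaxonomicStatus\n"
        "example-1\tName one\tACCEPTED\n"
        "example-2\tName two\tSYNONYM\n"
        "example-3\tName three\tSYNONYM\n"
    )
    for status, n in count_taxonomic_status(sample).most_common():
        print(f"{n:7d} {status}")
```

To run it against the download, call `count_from_zip("WFO_Backbone.zip")` and print the result the same way.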

jhpoelen commented 2 years ago

hey @Daniel-Mietchen - I am trying to add support in GloBI for https://www.wikidata.org/wiki/Property:P7715 (World of Flora Online) for @seltmann and friends.

However, for some reason, the property is not marked as an identifier for taxa, the way https://www.wikidata.org/wiki/Property:P815 is.

Can you please help to turn https://www.wikidata.org/wiki/Property:P7715 into an instance of https://www.wikidata.org/wiki/Q42396390 ?

For some reason, I cannot edit this property.

jhpoelen commented 2 years ago

Nomer v0.2.14 https://github.com/globalbioticinteractions/nomer/releases/tag/0.2.14 is now able to do things like:

$ echo -e "\tQuercus" | nomer append wfo
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [wfo]
[main] INFO org.globalbioticinteractions.nomer.match.WorldOfFloraOnlineTaxonService - [WORLD_OF_FLORA_ONLINE] taxonomy already indexed at [/media/jorrit/branta/nomer/world_of_flora_online/world_of_flora_online], no need to import.
    Quercus HAS_ACCEPTED_NAME   WFO:4000032377  Quercus genus       Fagales | Fagaceae | Quercus    WFO:9000000208 | WFO:7000000231 | WFO:4000032377    order | family | genus  http://www.worldfloraonline.org/taxon/wfo-4000032377    
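
For downstream use, a result row like the one above can be split into named fields. The following is a rough sketch under assumptions: the column names are guesses inferred from the example output rather than taken from nomer's documentation, and the tab positions in the sample line are reconstructed, because the page rendering has flattened the original whitespace.

```python
# Column names are assumptions inferred from the example output,
# not an official nomer schema.
COLUMNS = [
    "provided_id", "provided_name", "relation",
    "resolved_id", "resolved_name", "resolved_rank",
    "common_names", "path", "path_ids", "path_ranks", "url",
]


def parse_row(line):
    """Split one tab-separated result line into a dict keyed by COLUMNS.

    Only the newline is stripped, so a leading empty column (no provided id)
    is preserved. Extra trailing columns are ignored.
    """
    values = line.rstrip("\n").split("\t")
    return dict(zip(COLUMNS, values))


def split_path(value):
    """Split a ' | ' separated path column into a list of trimmed parts."""
    return [part.strip() for part in value.split("|")]


def lineage(row):
    """Return the classification as a list of (rank, name, id) tuples."""
    return list(zip(
        split_path(row["path_ranks"]),
        split_path(row["path"]),
        split_path(row["path_ids"]),
    ))


# The Quercus row from above, with tabs restored by assumption.
SAMPLE = (
    "\tQuercus\tHAS_ACCEPTED_NAME\tWFO:4000032377\tQuercus\tgenus\t\t"
    "Fagales | Fagaceae | Quercus\t"
    "WFO:9000000208 | WFO:7000000231 | WFO:4000032377\t"
    "order | family | genus\t"
    "http://www.worldfloraonline.org/taxon/wfo-4000032377\t\n"
)

if __name__ == "__main__":
    row = parse_row(SAMPLE)
    print(row["relation"], row["resolved_id"])
    for rank, name, taxon_id in lineage(row):
        print(rank, name, taxon_id)
```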

also,

$ nomer ls wfo | head -n2
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [wfo]
[main] INFO org.globalbioticinteractions.nomer.match.WorldOfFloraOnlineTaxonService - [WORLD_OF_FLORA_ONLINE] taxonomy already indexed at [/media/jorrit/branta/nomer/world_of_flora_online/world_of_flora_online], no need to import.
WFO:0000000001  Cirsium caput-medusae   SYNONYM_OF  WFO:0000027702  Cirsium spinosissimum   species     Asterales | Asteraceae | Cirsium | Cirsium spinosissimum    WFO:9000000038 | WFO:7000000146 | WFO:4000008373 | WFO:0000027702   order | family | genus | species    http://www.worldfloraonline.org/taxon/wfo-0000027702    
WFO:0000000002  Aster persaliens    HAS_ACCEPTED_NAME   WFO:0000000002  Aster persaliens    species     Asterales | Asteraceae | Aster | Aster persaliens   WFO:9000000038 | WFO:7000000146 | WFO:4000003381 | WFO:0000000002   order | family | genus | species    http://www.worldfloraonline.org/taxon/wfo-0000000002    
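
One possible use of the `nomer ls wfo` listing is a lookup from any listed name to its accepted name. The sketch below is illustrative only: `accepted_name_index` is a hypothetical helper, it assumes the output is tab-separated with the first five columns being provided id, provided name, relation, resolved id, and resolved name (as the two rows above suggest), and it handles only the two relation values seen in this thread.

```python
def accepted_name_index(lines):
    """Map each provided name to its (accepted id, accepted name).

    Lines with fewer than five tab-separated fields (for example log lines)
    are skipped. If the same name appears more than once, the last row wins.
    """
    index = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        _provided_id, provided_name, relation, resolved_id, resolved_name = fields[:5]
        # Only the relation values that appear in the example output are handled.
        if relation in ("SYNONYM_OF", "HAS_ACCEPTED_NAME"):
            index[provided_name] = (resolved_id, resolved_name)
    return index


# The two rows from above, truncated to five columns, tabs restored by assumption.
SAMPLE_ROWS = [
    "WFO:0000000001\tCirsium caput-medusae\tSYNONYM_OF\t"
    "WFO:0000027702\tCirsium spinosissimum\n",
    "WFO:0000000002\tAster persaliens\tHAS_ACCEPTED_NAME\t"
    "WFO:0000000002\tAster persaliens\n",
]

if __name__ == "__main__":
    index = accepted_name_index(SAMPLE_ROWS)
    print(index["Cirsium caput-medusae"])
```

In practice the lines would come from standard input, for example by piping `nomer ls wfo` into a script that passes `sys.stdin` to the helper. Keying on the name alone is a simplification: homonyms would overwrite each other, so keying on the provided id may be safer.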

@seltmann please confirm that the behavior is as expected.

jhpoelen commented 1 year ago

@jtmiller28 @seltmann initial support for world of flora online has been in use for a while. If you have additional feature requests for WFO, or find suspicious behavior, please open a new issue.