gnames / gnverifier

GNverifier verifies scientific names against more than 100 biodiversity databases
https://verifier.globalnames.org
MIT License

Problems with WCVP #120

Closed dimus closed 1 month ago

dimus commented 1 month ago

Thanks a lot for uploading the latest data. We tested the API against the POWO data and ran into some issues. We sent the name "Sinapidendron angustifolium", and these are the issues we experienced:

The outlink field is empty
All current[...] fields are empty.
The name is classified as a synonym even though it is accepted in the CSV.
The classificationRanks and classificationPaths are missing lower levels, for instance species.
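
For anyone who wants to re-check these four points after a data update, here is a minimal sketch in Python. It assumes the record is the "bestResult" object of a verification response already parsed into a dict, and that the field names are "outlink", "current...", "isSynonym" and "classificationRanks". The helper name find_problems and its expect_accepted parameter are made up for illustration and are not part of GNverifier.

```python
from typing import Any


def find_problems(best: dict[str, Any], expect_accepted: bool = True) -> list[str]:
    """Return a list of the reported symptoms found in one bestResult record."""
    problems = []

    # 1. The outlink field is missing or empty.
    if not best.get("outlink"):
        problems.append("outlink is empty")

    # 2. Any field whose key starts with "current" is absent or blank.
    current = {k: v for k, v in best.items() if k.startswith("current")}
    if not current or any(v in ("", None) for v in current.values()):
        problems.append("current fields are empty")

    # 3. The name is flagged as a synonym although the source lists it as accepted.
    if expect_accepted and best.get("isSynonym", False):
        problems.append("flagged as synonym")

    # 4. The classification stops above species level.
    ranks = best.get("classificationRanks", "").split("|")
    if "species" not in ranks:
        problems.append("classification lacks species rank")

    return problems
```

An empty list means none of the four symptoms were found in that record.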

The outlink is available in the "references" column of wcvp_taxon.csv (inside https://sftp.kew.org/pub/data-repositories/WCVP/wcvp_dwca.zip), and wcvp_taxon.csv seems to contain more complete information than wcvp_names.csv. I don't know what happened to the other fields or why they are not populated.

We tested running against POWO as well and compared the responses. I attached two text files with the data. Please let me know if you would prefer it in plain text here in the chat instead. Let me know if there is anything I can do to help 🙏

dimus commented 1 month ago

I am getting there:

[
  {
    "id": "3e49c219-f10f-5740-a4bd-bfb0fbbc74b2",
    "name": "Sinapidendron angustifolium",
    "cardinality": 2,
    "matchType": "Exact",
    "bestResult": {
      "dataSourceId": 197,
      "dataSourceTitleShort": "World Checklist of Vascular Plants",
      "curation": "Curated",
      "recordId": "2476356",
      "localId": "288938-1",
      "entryDate": "2024-10-07",
      "sortScore": 9.414262843947126,
      "matchedNameID": "f2a8e16f-9c85-5f18-a09b-f823069010db",
      "matchedName": "Sinapidendron angustifolium (DC.) Lowe",
      "matchedCardinality": 2,
      "matchedCanonicalSimple": "Sinapidendron angustifolium",
      "matchedCanonicalFull": "Sinapidendron angustifolium",
      "currentRecordId": "2476356",
      "currentNameId": "f2a8e16f-9c85-5f18-a09b-f823069010db",
      "currentName": "Sinapidendron angustifolium (DC.) Lowe",
      "currentCardinality": 2,
      "currentCanonicalSimple": "Sinapidendron angustifolium",
      "currentCanonicalFull": "Sinapidendron angustifolium",
      "taxonomicStatus": "Accepted",
      "isSynonym": false,
      "classificationPath": "Plantae|Tracheophyta|Brassicaceae|Sinapidendron",
      "classificationRanks": "domain|phylum|family|genus",
      "editDistance": 0,
      "stemEditDistance": 0,
      "matchType": "Exact",
      "scoreDetails": {
        "cardinalityScore": 1,
        "infraSpecificRankScore": 0,
        "fuzzyLessScore": 1,
        "curatedDataScore": 0.6666667,
        "authorMatchScore": 0.2857143,
        "acceptedNameScore": 1,
        "parsingQualityScore": 1
      }
    },
    "dataSourcesNum": 1,
    "dataSourcesIds": [
      197
    ],
    "curation": "Curated"
  }
]
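
To see which levels the classification in a response like this covers, the two pipe-delimited strings can be paired up. This is only a sketch: it assumes "classificationPath" and "classificationRanks" always have the same number of segments, and the helper name classification_pairs is hypothetical.

```python
def classification_pairs(path: str, ranks: str) -> list[tuple[str, str]]:
    """Pair each rank with its taxon name from two pipe-delimited strings."""
    names = path.split("|")
    rank_list = ranks.split("|")
    if len(names) != len(rank_list):
        raise ValueError("classificationPath and classificationRanks differ in length")
    return list(zip(rank_list, names))


# Values copied from the response above.
pairs = classification_pairs(
    "Plantae|Tracheophyta|Brassicaceae|Sinapidendron",
    "domain|phylum|family|genus",
)
lowest_rank, lowest_name = pairs[-1]
print(lowest_rank, lowest_name)  # genus Sinapidendron
```

In the response above the lowest paired level is the genus, so the species-level point from the original report may be worth rechecking.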

The outlink still does not work, but that is a problem for all data sources, so I am going to close this ticket and open a new one for outlinks.

Please feel free to reopen if I am still missing something!