gbif / portal16

GBIF.org website
https://www.gbif.org
Apache License 2.0
Matched name in species lookup - why not valid name suggested? #1848

Open gbif-portal opened 1 year ago

gbif-portal commented 1 year ago

Matched name in species lookup - why not valid name suggested?

When you click the pencil icon, options are offered, but not the valid name of a synonym. Why?


Github user: @dschigel
System: Chrome 112.0.0 / Windows 10.0.0
Referer: https://www.gbif.org/tools/species-lookup
Window size: width 1536 - height 758
System health at time of feedback: OPERATIONAL

MortenHofft commented 1 year ago

The tool is simply a visual decoration for the species match API. Sort of an easy way to get started with the API. But both the tool and the API might well be improved.

Is it that if you provide a synonym name, the tool will not suggest the accepted name? That is simply because that isn't how the match API works. Occurrences are similarly indexed with the name they matched to, synonym or not. And it isn't clear to me that having their names rewritten is always what users would want either. But it could be an option: a "Snap to accepted names?" checkbox. Would that solve it for you? I guess it makes sense if you are trying to clean up names.
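A minimal sketch of what such a "snap to accepted names" option could do with a single match result. It assumes the match response exposes `usageKey`, `synonym` and, for synonyms, `acceptedUsageKey`; the function name `pick_usage_key` is hypothetical, and the example record is illustrative, built from the keys that appear in this thread rather than copied from a real response.

```python
from typing import Any, Dict, Optional


def pick_usage_key(match: Dict[str, Any], snap_to_accepted: bool) -> Optional[int]:
    """Return the taxon key to keep for one match result.

    Assumption: synonyms carry an `acceptedUsageKey` next to `usageKey`.
    """
    key = match.get("usageKey")
    if snap_to_accepted and match.get("synonym"):
        # Fall back to the matched key when no accepted key is present.
        return match.get("acceptedUsageKey", key)
    return key


# Illustrative record (keys taken from this thread, not a real API response).
example = {"usageKey": 4496004, "acceptedUsageKey": 1340542, "synonym": True}

print(pick_usage_key(example, snap_to_accepted=False))  # 4496004
print(pick_usage_key(example, snap_to_accepted=True))   # 1340542
```

With the option off, the name is kept exactly as matched, which is the current behaviour described above.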

dschigel commented 1 year ago

BombusRuined.csv Try this file, which I "spoiled" for course purposes. E.g. Apis hortorum is a synonym of the valid name Bombus hortorum. The tool detects that Apis hortorum is a synonym, but it does not suggest Bombus hortorum as the first name to edit to when you click the pencil icon. It should.

dschigel commented 1 year ago

Students also try with their own lists (not all 19 of them), but this is clearly a feature in demand. In a way, it reinforces a false image of GBIF as a taxonomic / nomenclatural authority, but it is too late to worry about that: the 2023 conference and courses made me realize that most GBIF users come for a quick check (of names, distributions), compared to users aiming at analyses and DOI citation. It would be interesting to see if web traffic confirms this impression from conferences and courses. I also suspect we never see this army of hit-and-go users in the helpdesk, blog, etc. #nosuchthingastypicaluser

dschigel commented 1 year ago

Maybe important: the download from which the "ruined" example was made was a Species list download, not a simple or DwC-a occurrence download. The course task was "let's see how many Bombus species are known (to GBIF) to occur in Bulgaria, and which".

ManonGros commented 1 year ago

@dschigel in case your students are interested in how the name matching works on GBIF: https://data-blog.gbif.org/post/gbif-species-api/

MortenHofft commented 1 year ago

This is the API call https://api.gbif.org/v1/species/suggest?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&limit=10&q=Apis+hortorum

This is the response

[
  {
    "key": 4496004,
    "nameKey": 837797,
    "kingdom": "Animalia",
    "phylum": "Arthropoda",
    "order": "Hymenoptera",
    "family": "Apidae",
    "genus": "Bombus",
    "species": "Bombus hortorum",
    "kingdomKey": 1,
    "phylumKey": 54,
    "classKey": 216,
    "orderKey": 1457,
    "familyKey": 4334,
    "genusKey": 1340278,
    "speciesKey": 1340542,
    "parent": "Bombus",
    "parentKey": 1340278,
    "nubKey": 4496004,
    "scientificName": "Apis hortorum Linnaeus, 1761",
    "canonicalName": "Apis hortorum",
    "rank": "SPECIES",
    "status": "HOMOTYPIC_SYNONYM",
    "synonym": true,
    "higherClassificationMap": {
      "1": "Animalia",
      "54": "Arthropoda",
      "216": "Insecta",
      "1457": "Hymenoptera",
      "4334": "Apidae",
      "1340278": "Bombus",
      "1340542": "Bombus hortorum"
    },
    "class": "Insecta"
  }
]

It isn't a feature of the API to suggest accepted names. If we believe it is a generally reasonable feature, the API would be the natural place to change it. Alternatively, the UI could try to resolve the names in the browser. There could be multiple synonyms in the response that would all have to be resolved and then added to the suggestions. They would also have to be made distinct, as 2 suggestions could have the same accepted name. And then the sorting would have to be reshuffled somehow. I'm not sure which approach I would prefer. Another alternative is a GraphQL proxy suggest.

Doing it in the client seems like a flawed approach, especially as we use suggest in so many places. It also requires more network traffic, and we want suggest to be fast.

So I would be in favour of an API change or a proxy suggest endpoint that enriches the results.
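To make the enrichment step concrete, here is a rough sketch of the resolve, de-duplicate and re-sort logic discussed above, written as a pure function that could sit in a proxy endpoint or in the client. It rests on one assumption drawn from the sample response: for a species-rank synonym, the `species` and `speciesKey` fields point at the accepted name. The function name `add_accepted_names`, the output shape, and the choice to list the accepted name directly before its synonym are all hypothetical.

```python
from typing import Any, Dict, List


def add_accepted_names(suggestions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Add accepted names to a list of suggest results, without duplicates."""
    enriched: List[Dict[str, Any]] = []
    seen = set()
    for item in suggestions:
        candidates = []
        # Assumption: species-rank synonyms expose the accepted name via
        # `species` / `speciesKey`, as in the sample response in this thread.
        if item.get("synonym") and item.get("rank") == "SPECIES" and item.get("speciesKey"):
            candidates.append({
                "key": item["speciesKey"],
                "name": item.get("species"),
                "acceptedFor": item.get("canonicalName"),
            })
        # The originally suggested name is always kept as well.
        candidates.append({
            "key": item["key"],
            "name": item.get("canonicalName"),
            "acceptedFor": None,
        })
        for candidate in candidates:
            # Two synonyms can share one accepted name, so emit each key once.
            if candidate["key"] not in seen:
                seen.add(candidate["key"])
                enriched.append(candidate)
    return enriched


# Trimmed copy of the record from the response shown in this thread.
sample = [{
    "key": 4496004,
    "species": "Bombus hortorum",
    "speciesKey": 1340542,
    "canonicalName": "Apis hortorum",
    "rank": "SPECIES",
    "synonym": True,
}]

result = add_accepted_names(sample)
for suggestion in result:
    print(suggestion["key"], suggestion["name"])
```

For the sample record this lists Bombus hortorum first and Apis hortorum second. Whether the accepted name should outrank the literal match is exactly the open sorting question, so the ordering here is only one possible answer.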

dschigel commented 1 year ago

Thanks! Maybe suggesting a valid name is not a widespread need, but it was very dominant in the course that just finished. In my opinion, if the valid name is not suggested, it should at least be listed as one of the many variants the system offers when the pencil icon is clicked. In the example linked above, the name Apis hortorum Linnaeus, 1761 is correctly identified as a synonym, but if you click the pencil, you don't see Bombus hortorum (Linnaeus, 1761), which removes a lot of potential value from this nice tool. People with domain knowledge will know what to do, but people coming for a quick check of a name, such as students in a GBIF course, will not see a valid name, either as the single suggestion or as one of many. Thanks also for the API implementation tips. If my justification is accepted, I am sure they will be needed for implementation!