gbif / hp-land

Search for vernacular names? #5

Closed jholetschek closed 1 year ago

jholetschek commented 1 year ago

Since our portal will be used by many citizen scientists, we're wondering whether it's possible to include a search filter based on vernacular names? The GBIF backbone has quite a lot of common names, so using them in a filter would be great!

Apparently gbif.org doesn't have such a filter for the occurrence search, only for species: https://www.gbif.org/species/search?q=windr%C3%B6schen

MortenHofft commented 1 year ago

It makes perfect sense. But there isn't really a good/simple way to do it. You kind of have to build it yourself.

If you have a checklist that you would like to use, then you could search that using something like

https://api.gbif.org/v1/species/search?q=windr%C3%B6schen&qField=VERNACULAR&datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c

And then find the backbone entry from that. And then use that taxonKey as the filter.
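
A minimal sketch of the lookup described above, not an official client. It assumes that each result of the species search exposes its backbone match as `nubKey`; the helper names (`buildVernacularSearchUrl`, `extractBackboneKeys`, `vernacularToTaxonKeys`) are hypothetical.

```javascript
// Sketch only: helper names are hypothetical, not part of any GBIF library.
const GBIF_API = "https://api.gbif.org/v1";

// Build the species search URL restricted to vernacular names in one checklist.
function buildVernacularSearchUrl(q, datasetKey, limit = 20) {
  const params = new URLSearchParams({
    q,
    qField: "VERNACULAR",
    datasetKey,
    limit: String(limit)
  });
  return `${GBIF_API}/species/search?${params}`;
}

// Collect the distinct backbone keys from a species search response.
// Assumption: results carry the backbone match in a numeric `nubKey` field.
function extractBackboneKeys(response) {
  const keys = (response.results || [])
    .map(usage => usage.nubKey)
    .filter(key => Number.isInteger(key));
  return [...new Set(keys)];
}

// Look up a common name and return keys that could be used as taxonKey filters.
async function vernacularToTaxonKeys(q, datasetKey) {
  const res = await fetch(buildVernacularSearchUrl(q, datasetKey));
  if (!res.ok) throw new Error(`GBIF species search failed: ${res.status}`);
  return extractBackboneKeys(await res.json());
}
```

Calling `vernacularToTaxonKeys("windröschen", "d7dddbf4-2cf0-4f39-9b2a-bb099caae36c")` would then yield the keys to pass to the occurrence search as `taxonKey`. Results without a backbone match are skipped.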

MortenHofft commented 1 year ago

I remember that https://github.com/VtEcostudies/VAL_GBIF_Wordpress/issues/12 and https://vcevaldev.wpengine.com/ do something related.

Let me think about it a bit more. Perhaps I can come up with an easier to use option. Your group is hardly the only one.

Do you have a checklist with vernacular names that you would like to use? The problem with vernacular names is that they can vary a lot, even within the same language. And we have vernacular names that others disagree with (or might even be factually wrong). So simply relying on the concatenated version in the backbone might be unsatisfactory.

jholetschek commented 1 year ago

So to make sure I understand right: If we do that on our own (like the Vermont Atlas), it wouldn't appear within the occurrence widget as a regular filter, right?

The question on the checklist is something for @T-Engel . As far as I know, for now the GBIF backbone would be sufficient.

T-Engel commented 1 year ago

Thanks, Morten. Yes, I think the GBIF backbone is a good starting point for our purpose because we have a rather broad taxonomic scope. I agree that the common names may cause some issues with synonyms and so on, but nonetheless I think it's a useful addition, and it would not replace the search for scientific name. Basically, we would love to have the GBIF species search integrated in the occurrence search of the portal.

MortenHofft commented 1 year ago

> So to make sure I understand right: If we do that on our own (like the Vermont Atlas), it wouldn't appear within the occurrence widget as a regular filter, right?

All the suggest endpoints can be overwritten. So if you had an endpoint that allowed search by vernacular names and matched those to GBIF taxonKeys, then that would work and simply replace the existing suggest in the filter. The Vermont group has an example where the suggest of the "administrative area" filter is overwritten. Live example: https://vcevaldev.wpengine.com/gbif-explorer/ Code: https://github.com/VtEcostudies/VAL_GBIF_Wordpress/blob/main/js/gbif_data_widget.js#L39 and https://github.com/VtEcostudies/VAL_GBIF_Wordpress/blob/main/js/gbif_data_widget.js#L65

To make it easier I've added a new experimental suggest endpoint in the graphql service that perhaps does what you need

Also to make it simpler I've added a suggest for it

So now it can be used by specifying that the filter should use the new suggest endpoint.

I've added the config in https://github.com/gbif/hp-land/commit/dd01ee6b412d3e0ab1538b9c3a8fe598c587d3d3 so you can evaluate if it is what you imagined. It extends the existing taxon filter to include common names

[Screenshot 2023-02-08 at 17 00 31]

T-Engel commented 1 year ago

It works beautifully! Thanks so much, @MortenHofft. We'll let you know if we get some unexpected behaviour in future.

Thanks also for the insights into the administrative area filter of the Vermont people. @jholetschek I think it would make a lot of sense for us to do something similar, restricting the suggestions to Germany. What do you think?

In addition to the two places in the code that you pointed out, Morten, we would also need this line to specify Germany, right? https://github.com/VtEcostudies/VAL_GBIF_Wordpress/blob/85b4a919e1dd203018476697c5316dc455438512/js/gbif_data_config.js#L68

MortenHofft commented 1 year ago

That file is just because they split the gadm ID into a separate file. It could just as well have been entered directly in https://github.com/VtEcostudies/VAL_GBIF_Wordpress/blob/main/js/gbif_data_widget.js#L44

So for you it would be DEU

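// Override the suggest function of the GADM (administrative area) filter.
function getSuggestions({ client }) {
  return {
    gadmGid: {
      getSuggestions: ({ q }) => {
        // gadmGid=DEU is the new part: the suggester now only suggests areas in Germany.
        // encodeURIComponent keeps special characters in the typed text from breaking the query string.
        const { promise, cancel } = client.v1Get(`/geocode/gadm/search?gadmGid=DEU&limit=100&q=${encodeURIComponent(q)}`);
        return {
          // Add title and key fields to each API result, keeping the original fields.
          promise: promise.then(response => ({
            data: response.data.results.map(x => ({ title: x.name, key: x.id, ...x }))
          })),
          cancel
        };
      }
    }
  };
}
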
MortenHofft commented 1 year ago

I've added the restricted suggestions for gadm https://github.com/gbif/hp-land/commit/1ba22405310fffff4718ab5e0f0eeb455e564567. It should be easy to remove if you do not like it. Let me know if you would rather add these things yourself instead of me pushing changes.

jholetschek commented 1 year ago

Thanks, Morten! That's fine - that's what staging is for :)

One question on the suggestions for the common names: How would we allow a pattern search like in the other filter? So that the first suggestions show up after typing "windr", for example?

T-Engel commented 1 year ago

Yes, thanks for going ahead with the gadm. The filter for Germany seems to work; however, now it only suggests the 16 federal states and no smaller entities like counties or cities (it used to do that). Is that something you could include for us as well?

MortenHofft commented 1 year ago

@T-Engel

> Yes, thanks for going ahead with the gadm. The filter for Germany seems to work; however, now it only suggests the 16 federal states and no smaller entities like counties or cities (it used to do that). Is that something you could include for us as well?

That part should be the exact same as before.

Before, it called https://api.gbif.org/v1/geocode/gadm/search?limit=100&q=ber and now it adds gadmGid=DEU: https://api.gbif.org/v1/geocode/gadm/search?gadmGid=DEU&limit=100&q=ber

If it does not do what you expect it to, then could you please provide an example that I can check. It sounds like an API or content issue.

MortenHofft commented 1 year ago

@jholetschek

> One question on the suggestions for the common names: How would we allow a pattern search like in the other filter? So that the first suggestions show up after typing "windr", for example?

Yes, I was curious if anyone noticed. That isn't possible with this API. It is simply using the species/search endpoint, which also searches vernacular names. The species/suggest endpoint is fast and supports autocompletion. Search is slower and does not handle partial words well. You will have to choose or provide a suggest service by other means. You can overwrite the suggest endpoint just like we did with GADM above.
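
A rough sketch of such an override, following the shape of the GADM example above. It queries species/suggest and species/search in parallel and merges the results, so partial scientific names and whole-word common names both produce suggestions. It does not add prefix matching on common names. The filter key `taxonKey`, the helper `mergeSuggestions` and the limits are assumptions, and `client.v1Get` is assumed to behave as in the GADM example.

```javascript
// Sketch only: the filter key "taxonKey" and the helper names are assumptions.
// datasetKey taken from the species/search URL earlier in this thread.
const CHECKLIST_KEY = "d7dddbf4-2cf0-4f39-9b2a-bb099caae36c";

// Merge two suggestion lists, keeping the first entry seen for each key.
function mergeSuggestions(primary, secondary) {
  const seen = new Map();
  for (const item of [...primary, ...secondary]) {
    if (!seen.has(item.key)) seen.set(item.key, item);
  }
  return [...seen.values()];
}

function getSuggestions({ client }) {
  return {
    taxonKey: {
      getSuggestions: ({ q }) => {
        const term = encodeURIComponent(q);
        // Fast autocompletion, handles partial words.
        const suggest = client.v1Get(
          `/species/suggest?datasetKey=${CHECKLIST_KEY}&limit=10&q=${term}`);
        // Slower full search, restricted to vernacular names.
        const search = client.v1Get(
          `/species/search?datasetKey=${CHECKLIST_KEY}&qField=VERNACULAR&limit=10&q=${term}`);
        const toItem = x => ({ title: x.scientificName, key: x.key, ...x });
        return {
          // species/suggest returns a plain array, species/search wraps it in `results`.
          promise: Promise.all([suggest.promise, search.promise]).then(([a, b]) => ({
            data: mergeSuggestions(a.data.map(toItem), b.data.results.map(toItem))
          })),
          // Cancel both underlying requests.
          cancel: () => { suggest.cancel(); search.cancel(); }
        };
      }
    }
  };
}
```

Suggest results are listed first because that endpoint is the faster one; entries found by both endpoints appear only once.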

I've created an issue with a suggestion to improve the API for search/suggest for vernacular names

T-Engel commented 1 year ago

> If it does not do what you expect it to, then could you please provide an example that I can check. It sounds like an API or content issue.

Sure. Try searching for "Leipzig". Currently, we don't get suggestions for that. The screenshot below is from the virtual herbarium. It should look like that.

image

MortenHofft commented 1 year ago

Thanks - I cannot see anything in the screenshot, but the example text works as a test case (the results are indeed wrong): https://api.gbif.org/v1/geocode/gadm/search?gadmGid=DEU&limit=100&q=Leipzig vs https://api.gbif.org/v1/geocode/gadm/search?limit=100&q=Leipzig
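
The comparison between the two URLs can be scripted. This is only a sketch: the helper names are hypothetical, and it assumes that each result has a string `id` that starts with the country code (for example `DEU`) for areas inside that country.

```javascript
// Sketch only: helper names are hypothetical.
const GADM_SEARCH = "https://api.gbif.org/v1/geocode/gadm/search";

// Build the GADM search URL, optionally restricted to a parent area such as "DEU".
function gadmSearchUrl(q, gadmGid) {
  const params = new URLSearchParams({ limit: "100", q });
  if (gadmGid) params.set("gadmGid", gadmGid);
  return `${GADM_SEARCH}?${params}`;
}

// Areas that the unrestricted search returns for the country
// but that the restricted search omits.
// Assumption: ids of areas inside the country start with its code.
function missingFromRestricted(unrestricted, restricted, gadmGid) {
  const found = new Set(restricted.map(x => x.id));
  return unrestricted.filter(
    x => String(x.id).startsWith(gadmGid) && !found.has(x.id));
}

// Fetch both result lists and report what the restricted search is missing.
async function compareGadmSearches(q, gadmGid) {
  const [all, limited] = await Promise.all(
    [gadmSearchUrl(q), gadmSearchUrl(q, gadmGid)].map(async url => {
      const res = await fetch(url);
      if (!res.ok) throw new Error(`GADM search failed: ${res.status}`);
      return (await res.json()).results;
    })
  );
  return missingFromRestricted(all, limited, gadmGid);
}
```

A non-empty result from `compareGadmSearches("Leipzig", "DEU")` would point to entries that are dropped only when `gadmGid` is set.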

T-Engel commented 1 year ago

Sorry for the small image. Here comes a bigger one:

image

When I do the same for the LAND portal I get no suggestions.

MortenHofft commented 1 year ago

Thanks @T-Engel, I've created an issue for it here: https://github.com/gbif/geocode/issues/19 (there seems to be a bug in the API).

T-Engel commented 1 year ago

Great. Thanks!

T-Engel commented 1 year ago

@jholetschek, I guess we should make a decision regarding the administrative areas. Until the bug is fixed, I think it would be better to remove the Germany filter. I prefer keeping the smaller administrative areas, even if that means we get suggestions outside Germany. What do you think?

jholetschek commented 1 year ago

Agreed, I'd also remove the filter on Germany.

T-Engel commented 1 year ago

Done.