gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Keywords for GRIIS datasets not being indexed properly #2479

Open MortenHofft opened 4 years ago

MortenHofft commented 4 years ago

@dschigel has noticed that the links to invasive alien species no longer appear on country pages. It could be a data content issue, but it looks to me like an indexing issue.

GRIIS datasets for specific countries are searchable by keywords. A search for Belgium with keyword=country_BE works as expected, but many other countries no longer work, e.g. Spain with keyword=country_ES. This is even though there is a dataset for Spain with the correct keyword.

Perhaps this has to do with moving dataset search to Elasticsearch, @fmendezh?
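For anyone wanting to reproduce the keyword search described above, here is a minimal sketch that builds the request URL. It assumes the public dataset search endpoint at `https://api.gbif.org/v1/dataset/search` and its `keyword` parameter; the helper name `griis_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: base URL of the public GBIF API (v1).
GBIF_API = "https://api.gbif.org/v1"


def griis_search_url(country_code: str) -> str:
    """Build a dataset search URL for the country_XX keyword convention."""
    keyword = f"country_{country_code.upper()}"
    return f"{GBIF_API}/dataset/search?{urlencode({'keyword': keyword})}"


if __name__ == "__main__":
    # Compare a country that works with one that does not.
    for code in ("BE", "ES"):
        print(griis_search_url(code))
```

Fetching both URLs and comparing the result counts should show whether the Spanish dataset is missing from the index.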

dschigel commented 4 years ago

Thanks for looking at this. Please note that one country may have more than one GRIIS checklist, as lists for overseas territories are separate from the mainland list. My suggestion is that in such cases the tab, if restored, resolves to the page that lists these datasets for each country. Agree, @timhirsch?

MortenHofft commented 4 years ago

That is already partly the behaviour. If there is more than one result for a search for, say, country_ES, then it links to dataset search instead (with those results). But if a dataset uses country_ES_SOME_TERRITORY, then it won't be picked up by the API. This is the compromise we had to accept when using keywords instead of a more structured approach.
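The behaviour described above can be sketched roughly as follows. This is an illustration only, not the portal's actual code: the function name, the dataset dictionaries, the placeholder keys and the exact-match rule are assumptions made for the example.

```python
def alien_species_link(country_code, datasets):
    """Return the link target for the alien species tab, or None.

    `datasets` is a list of dicts with hypothetical "key" and "keywords"
    fields. Matching is exact, so a keyword such as
    country_ES_SOME_TERRITORY does not match country_ES.
    """
    keyword = f"country_{country_code.upper()}"
    matches = [d for d in datasets if keyword in d["keywords"]]
    if not matches:
        return None  # no tab is shown
    if len(matches) == 1:
        return f"/dataset/{matches[0]['key']}"  # link straight to the dataset
    return f"/dataset/search?keyword={keyword}"  # several hits: link to search


# Placeholder data, not real registry entries.
datasets = [
    {"key": "mainland-key", "keywords": ["country_ES"]},
    {"key": "territory-key", "keywords": ["country_ES_SOME_TERRITORY"]},
]
print(alien_species_link("ES", datasets))
```

With this data the territory checklist is ignored and the tab links directly to the mainland dataset, which is the limitation of the keyword approach.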

MattBlissett commented 4 years ago

The registry search index is still in SOLR, but it's possible something is out-of-sync. I'll investigate recreating the index.

MattBlissett commented 4 years ago

I'll reindex the registry; registry search will show partial results for about 10 minutes.

> Please note that one country may have more than one GRIIS checklist as list for overseas territories are separate from the mainland list.

We now run the IPT for GRIIS, so if something in the EML or whatever is limiting what we can do, it's much easier for us to help / script a bulk fix.

dschigel commented 4 years ago

Thanks, good that we are in agreement on the preferred outputs for more than one IAS checklist per country. Fingers crossed for the fix. Currently, searching for e.g. Yemen and clicking on the alien tab leads to mainland Yemen https://www.gbif.org/dataset/06d566fc-ac40-4283-bac4-7b4198204259, but not yet to e.g. Soqotra https://www.gbif.org/dataset/29d2d5a6-db22-4abd-b784-9ab2f9757c3c. It could be that some fixes are needed at the publisher end. FYI, GRIIS is contracted by GBIF until 31 March, so we have a higher chance of rapid edits before this date.

MortenHofft commented 4 years ago

@dschigel (and @timhirsch) Dmitry's example above is well known and has been a natural part of the implementation all along. The original compromise is discussed here: https://github.com/gbif/portal-feedback/issues/734.

> the keyword approach is a here and now solution not a scalable solution

If we need to change the behaviour to better accommodate islands/territories, then I think it might be time to rethink the original "hack".

dschigel commented 4 years ago

I am available. As discussed in the correspondence with Shyama, whatever the implementation is at the GBIF end (keyword-based, ISO or something else), in the end some of the overseas territories should be shown together with the mainland checklist (e.g. Portugal + Madeira), while others (with their own ISO code) should not, and need to be displayed on independent country pages (e.g. Falklands separately).
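To make the rule above concrete, here is a small sketch of the grouping. The record fields (`title`, `mainland_iso`, `own_iso`) and the function name are hypothetical and not part of any GBIF API. The only facts assumed are that the Falkland Islands have their own ISO 3166-1 code (FK) while Madeira does not.

```python
from collections import defaultdict


def group_checklists(checklists):
    """Group checklist titles by the country page they should appear on.

    A territory with its own ISO code gets its own page; otherwise it is
    shown together with the mainland checklist.
    """
    pages = defaultdict(list)
    for checklist in checklists:
        page = checklist["own_iso"] or checklist["mainland_iso"]
        pages[page].append(checklist["title"])
    return dict(pages)


# Illustrative records only.
checklists = [
    {"title": "Portugal", "mainland_iso": "PT", "own_iso": "PT"},
    {"title": "Madeira", "mainland_iso": "PT", "own_iso": None},
    {"title": "Falkland Islands", "mainland_iso": "GB", "own_iso": "FK"},
]
print(group_checklists(checklists))
```

Whether the territory information comes from keywords or from a structured field is left open here; the sketch only shows the intended outcome.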