gbif / hosted-portals

Support material for establishing the GBIF Hosted Portals
Apache License 2.0

Filters do not limit to portal scope #187

Open albenson-usgs opened 2 years ago

albenson-usgs commented 2 years ago

I think this may be related to #158 but a different use case. I added datasetName to the filter options and I'm expecting to only be able to select datasets that are scoped to the United States but it is searching all datasets in GBIF. Maybe it is not possible to scope the options to the portal but I think this would be what users are expecting.

MortenHofft commented 2 years ago

Yep, that isn't how it works. It isn't how it works on GBIF.org either. It shows all possible datasets, including datasets that have no occurrences. I agree it would be nice though.

I recently had dataset titles added to the occurrence index for that reason: it was a requirement if we ever wanted to do exactly this.

NB: this "problem" also exists for many other filters, e.g. country.

MortenHofft commented 2 years ago

Other filters that behave this way

For some reason it feels more "wrong" in some cases than others.

tucotuco commented 2 years ago

@MortenHofft Do you anticipate that it is intractable to implement portal-level scope filters? I think behavior consistency would be important. I also think the user experience would be less confusing if they were in place.

MortenHofft commented 2 years ago

I think it is doable, but it requires a fair amount of work for some of them I believe. And some of it would require changes to our index (which isn't something I can do).

I do fully agree that it would be nice for some filters (like dataset) and have been wondering from before writing my first line of code when someone would request it.

But I'm not convinced that it should be done for everything. In https://github.com/gbif/hosted-portals/issues/158#issuecomment-843648832 you also argued that we shouldn't do it for scientific names for example.

And it feels like it could be complicated to do well. Just take a simple basic filter like basisOfRecord. Currently it gets the options from a list of enumerations like https://api.gbif.org/v1/enumeration/basic/BasisOfRecord (except the list is kept in code and shipped with the library, to avoid doing 30 calls to get live enumerations).

That would have to be changed to a dynamically loaded version that checks which values exist in the index for a given data scope (only 4 basisOfRecord values exist in VertNet). But that can change in 2 seconds when new data is published, so the filter options suddenly have a dynamic size. We then get issues with performance and caching. And currently it isn't even async (for performance reasons), so adding that, and doing so from a somewhat slow endpoint that cannot really be cached, feels like trouble.
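To make the trade-off concrete, here is a minimal sketch of what scope-aware options could look like. It is not how the portal library is implemented. It assumes the data scope can be written as a flat map of occurrence search parameters (the `Scope` type is hypothetical), that the occurrence search API is asked for a facet on the filter field with `limit=0`, and that the response carries a `facets` array of `{ name, count }` entries. The static list is abbreviated and the function names are made up for illustration.

```typescript
// Hypothetical shape of a portal's data scope: a flat map of occurrence
// search parameters. The real portal configuration may look different.
type Scope = Record<string, string>;

// Assumed shape of the facet part of an occurrence search response.
interface FacetCount { name: string; count: number; }
interface Facet { field: string; counts: FacetCount[]; }
interface FacetResponse { facets: Facet[]; }

// Abbreviated stand-in for the enumeration that is shipped with the library.
const ALL_BASIS_OF_RECORD: string[] = [
  "PRESERVED_SPECIMEN",
  "FOSSIL_SPECIMEN",
  "HUMAN_OBSERVATION",
  "MACHINE_OBSERVATION",
  "MATERIAL_SAMPLE",
];

// Build a facet query restricted to the scope. limit=0 asks for counts only,
// with no occurrence records in the response.
function buildFacetUrl(field: string, scope: Scope): string {
  const params: Scope = { ...scope, facet: field, facetLimit: "100", limit: "0" };
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
  return `https://api.gbif.org/v1/occurrence/search?${query}`;
}

// Keep only the static options that have at least one record in the scope.
// Only one facet is requested, so the first entry in `facets` is used.
function optionsInScope(allValues: string[], response: FacetResponse): string[] {
  const counts = response.facets.length > 0 ? response.facets[0].counts : [];
  const present = counts.filter((c) => c.count > 0).map((c) => c.name);
  return allValues.filter((value) => present.indexOf(value) !== -1);
}
```

The URL from `buildFacetUrl("basisOfRecord", scope)` would have to be fetched every time the filter opens, or cached with a short lifetime, which is exactly the async and caching cost described above. The static list is still needed to keep a stable ordering and labels for the options.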

I'm sure we can find decent solutions, but it isn't a simple fix

MortenHofft commented 2 years ago

dataset and publisher search is now limited to the current data scope