opengeospatial / ogcapi-records

An open standard for the discovery of geospatial resources on the Web.
https://ogcapi.ogc.org/records

support distributed search capability #118

Open tomkralidis opened 3 years ago

tomkralidis commented 3 years ago

Some infrastructures have a metadata catalogue architecture of 1..n distributed endpoints that may harvest or aggregate from one another. While harvesting would be put forth as an extension (see #48), the ability to do a run-time search of a federation of catalogues (that have not harvested/aggregated one another) is also a valuable use case.

Options:

cc @pvretano @cportele @uvoges

mhogeweg commented 3 years ago

I can see it being useful to have some standardization around federated search. We in fact have this in our Geoportal Search Component where we support a vendor-specific parameter to indicate pre-configured sources to search. For example: search ArcGIS Online and Geoportal Server for 'map'.

What needs to be clear in any specification is:

pvretano commented 1 year ago

14-NOV-2022:

Federated search would definitely be an extension ... although I suppose a client that knows how to query one OAPIR server could query across a bunch of OAPIR servers and then aggregate the results.
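The client-side approach described here (query several servers, then aggregate) can be sketched roughly as follows. This is an illustration only, not something defined by the specification: the function names, the `(base_url, collection_id)` endpoint tuples and the `{"source", "record"}` result shape are assumptions made for the example. It assumes each server exposes the usual `/collections/{collectionId}/items` path and returns a GeoJSON feature collection.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_one(base_url, collection_id, params, timeout=10):
    """Query the items endpoint of a single server and return its features.

    Assumes the response body is a GeoJSON FeatureCollection.
    """
    url = f"{base_url.rstrip('/')}/collections/{collection_id}/items?{urlencode(params)}"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp).get("features", [])


def federated_search(endpoints, params, fetch=search_one):
    """Send the same query to every endpoint and merge the results.

    `endpoints` is a list of (base_url, collection_id) pairs (hypothetical
    shape). Each record is tagged with the server it came from; servers that
    fail or return invalid JSON are skipped rather than aborting the search.
    """
    merged = []
    for base_url, collection_id in endpoints:
        try:
            features = fetch(base_url, collection_id, params)
        except (OSError, ValueError):
            continue  # unreachable server or unparsable response
        merged.extend({"source": base_url, "record": f} for f in features)
    return merged
```

Passing `fetch` as a parameter keeps the aggregation logic separate from the HTTP call, so it can be exercised without a network. The sketch makes no attempt at paging, ranking or de-duplication across servers, which are exactly the points an extension would need to settle.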

We need to be clear about what we mean by harvesting. Three types of harvesting were identified in the SWG call:

Another wrinkle here is that the search API for records is really the Features API, which does not currently include a federated (or cross-collection, cross-deployment) search capability. This might be another case where we define the functionality here in Records but it eventually gets moved over to Features.

kalxas commented 1 year ago

The above 3 types of harvesting cover the "offline" mode of distributed search, i.e. the local catalogue has already done queries to the remote catalogue and has stored the results/records in the local database/model.

We also need to define the "online" mode of distributed search (a.k.a. federated search) that was previously defined in CSW 2 and 3, i.e. the local catalogue is doing live queries to the remote catalogue(s) and presents the results/records without storing them locally. See http://docs.opengeospatial.org/is/12-168r6/12-168r6.html#58 and http://docs.opengeospatial.org/is/12-176r7/12-176r7.html#85

In this case we need to describe things like how the records can be grouped/aggregated, how a list of federated catalogues is retrieved, etc. For example:

There is interest in the EO domain for the online/federated search case because EO catalogues include millions of products/records and are less easy to harvest/maintain in "offline" mode.
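To make the "online" mode concrete, here is a minimal sketch of a local catalogue fanning a query out to remote catalogues at request time and returning the results grouped by source, without storing anything locally. Everything in it is hypothetical: `live_federated_search`, the name-to-URL mapping and the `status`/`records`/`detail` keys are made up for illustration, and `fetch` stands in for whatever call actually queries one remote catalogue.

```python
from concurrent.futures import ThreadPoolExecutor


def live_federated_search(catalogues, query, fetch, max_workers=4):
    """Query every remote catalogue concurrently and group results by source.

    `catalogues` maps a catalogue name to its URL (hypothetical shape).
    `fetch(url, query)` must return a list of records for one catalogue.
    Nothing is persisted: the grouped results are built per request.
    """
    def run(url):
        # Report a failing catalogue in its own group instead of failing
        # the whole federated request.
        try:
            return {"status": "ok", "records": fetch(url, query)}
        except Exception as exc:
            return {"status": "error", "records": [], "detail": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run, url) for name, url in catalogues.items()}
        return {name: future.result() for name, future in futures.items()}
```

Grouping by source is only one possible answer to the aggregation question raised above; a merged, ranked list would be another, with different implications for paging.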

tomkralidis commented 1 year ago

I think this capability should not be part of core, but rather a conformance class or an extension. Thoughts @pvretano @kalxas @mhogeweg ?

pvretano commented 1 year ago

@tomkralidis as I mentioned above, federated or distributed search would definitely be an extension, as harvesting would. So, I agree with you.

tomkralidis commented 1 year ago

OK. Perhaps we should move the "Extensions" column out of the "Part 1: Core" project?

tomkralidis commented 10 months ago

2023-11-01 OGC API code sprint: