gbv / cocoda-sdk

SDK for Cocoda and coli-conc services
https://gbv.github.io/cocoda-sdk/
MIT License

Issue with different APIs requiring different scheme URIs #43

Closed stefandesu closed 2 years ago

stefandesu commented 2 years ago

This could possibly belong to https://github.com/gbv/bartoc.org rather than here.

Related to https://github.com/gbv/cocoda-sdk/issues/41.

There is an issue with determining a scheme's API by using BARTOC's API field. In particular, the issue is that certain APIs require a specific scheme URI to be used for requests.

For example: If you currently run node examples/bartoc.js on the dev branch, it will say that there are 0 top concepts for BK. The API for BK is DANTE, and DANTE is not aware of bartoc.org URIs, it seems. Thus, when calling the /voc/top endpoint with the bartoc.org URI (which is in the main uri field and therefore used for the call), it returns 0 results (without an error message).

Previously, this wasn't an issue in Cocoda because Cocoda didn't use registries based on the scheme's API field (initialized by registryForScheme). But this is an issue in BARTOC, as you can see for BK (https://bartoc.org/en/node/18785#content). And since we are planning to use this for Cocoda as well (see https://github.com/gbv/cocoda/issues/670), it is going to become an issue there too.

There are multiple potential solutions:

  1. Force all APIs to support BARTOC URIs. This shouldn't be an issue for DANTE in particular, but I don't think it's realistic to enforce this for every single API.

  2. Determine the scheme's "main URI" by calling the API's endpoint for schemes and saving the uri that it returns. This is what we have done before, see ConceptApi's and SkosmosApi's _getSchemeUri method. However, when initializing a registry from the API field, we never call the API's scheme endpoint and therefore never know which is the main URI. In theory, we can adjust the existing methods to ignore the field _api.schemes that is currently used and call the API's scheme endpoint regardless. Not sure if this is the best or most efficient solution though. (See more on this below.)

  3. In the data of the API field, add an additional (optional) field uri that is the main URI that this particular API is using for this scheme. This might be the easiest solution, even if it requires some editorial work.

  4. Use all available identifiers in each call where the scheme URI is needed. This would work for ConceptApi since it should support multiple URIs separated by | (this works in DANTE as well). This would be the easiest to implement and will work as long as this issue only exists for ConceptApi.
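A minimal sketch of solution 4, assuming a JSKOS-style scheme object with a `uri` string and an optional `identifier` array. The helper names `allSchemeUris` and `schemeUriParam` are hypothetical and not part of cocoda-sdk, and the BK values are only illustrative:

```javascript
// Hypothetical helper (not part of cocoda-sdk): collect every URI known for a scheme.
function allSchemeUris(scheme) {
  const uris = [scheme.uri, ...(scheme.identifier || [])].filter(Boolean)
  // Remove duplicates while keeping the original order
  return [...new Set(uris)]
}

// Build the value of the `uri` parameter for a request such as /voc/top.
// Multiple URIs are joined with "|", as described for ConceptApi above.
function schemeUriParam(scheme) {
  return allSchemeUris(scheme).join("|")
}

// Illustrative scheme object; the identifier value is an assumption
const bk = {
  uri: "http://bartoc.org/en/node/18785",
  identifier: ["http://uri.gbv.de/terminology/bk/"],
}
console.log(schemeUriParam(bk))
```

The resulting string would still need to be URL-encoded when it is placed in a query string.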

One additional thing: SkosmosApi seems to already use solution 2 and ignores the _api.schemes field as far as I can tell. I'm actually not sure if this is intended behavior, but it definitely fixes this issue. I think the reason we're using the _api.schemes field in ConceptApi is that we wanted to be able to restrict a certain registry to a subset of its schemes. However, instead of directly returning _api.schemes if it is a list of schemes, we could still call the scheme endpoint and simply filter the results with those schemes that are in the _api.schemes list. If we do this, solution 2 would probably be the best solution for this issue, in particular because the basis for this is already implemented. (I need to see if this is actually possible.)
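The filtering idea could look roughly like the sketch below. It is not the actual cocoda-sdk code: `fetchSchemes` stands in for whatever function calls the API's scheme endpoint, `configured` plays the role of the _api.schemes list, and all helper names are hypothetical. It also assumes that two scheme objects refer to the same scheme if they share at least one URI in `uri` or `identifier`.

```javascript
// Hypothetical helpers for illustration; not the actual cocoda-sdk implementation.
const urisOf = (scheme) => [scheme.uri, ...(scheme.identifier || [])].filter(Boolean)

// True if both scheme objects share at least one URI
function sameScheme(a, b) {
  const urisB = new Set(urisOf(b))
  return urisOf(a).some((uri) => urisB.has(uri))
}

// fetchSchemes: async function returning the schemes as the API itself reports them
// configured: optional list restricting the registry to a subset of schemes
async function getSchemes(fetchSchemes, configured) {
  const fromApi = await fetchSchemes()
  if (!Array.isArray(configured)) {
    return fromApi
  }
  // Always ask the API, then keep only the schemes named in the configured list
  return fromApi.filter((scheme) => configured.some((c) => sameScheme(scheme, c)))
}

// Given a scheme object (e.g. with a bartoc.org URI), return the URI the API uses for it
async function getSchemeUri(fetchSchemes, configured, scheme) {
  const schemes = await getSchemes(fetchSchemes, configured)
  const match = schemes.find((s) => sameScheme(s, scheme))
  return match ? match.uri : null
}
```

With this approach the URI sent to the API is always the one the API reported itself, at the cost of one extra request to the scheme endpoint (which could be cached).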

@nichtich

stefandesu commented 2 years ago

  5. Allow setting a different "main URI" in BARTOC. Usually you would expect that the uri field contains the principal URI for that entity, and that the identifier field contains additional identifiers used by other systems. You could argue that the bartoc.org URI is one of those additional identifiers rather than the principal URI, and that the uri field, even for schemes returned by BARTOC, should contain the URI set by the publishers of that scheme.

nichtich commented 2 years ago

Providers should try as best they can to find out which URI (from uri and identifier) to use in the backend. For jskos-server and DANTE, solution 4 can be applied, but other APIs may need a more complex wrapper, additional calls (such as in SkosmosApi), or additional information not expressible in the url field of the API configuration. I don't think this is urgent, but the best solution is to allow configuration of more fields in addition to type and url for APIs, in particular if we want to support different APIs for the same vocabulary based on different versions, languages etc.

Solution 5 also seems reasonable but requires other changes in BARTOC.
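For illustration, an API entry with an additional optional `uri` field (solution 3) might be consumed as in the sketch below. The field name `uri` comes from the proposal above; the helper `schemeUriForApi` and all values in the example are hypothetical placeholders, not actual BARTOC data.

```javascript
// Hypothetical helper: pick the scheme URI to send to a particular API.
// Prefers the API-specific `uri` from the API entry, falls back to the scheme's main URI.
function schemeUriForApi(scheme, apiEntry) {
  return (apiEntry && apiEntry.uri) || scheme.uri
}

// Placeholder data, assumed for this sketch only
const scheme = {
  uri: "http://bartoc.org/en/node/18785",
  API: [
    {
      type: "http://bartoc.org/api-type/jskos", // illustrative API type
      url: "https://api.example.org/", // placeholder API base URL
      uri: "http://uri.gbv.de/terminology/bk/", // proposed optional field
    },
  ],
}

console.log(schemeUriForApi(scheme, scheme.API[0]))
```

Because the field is optional, existing API entries without `uri` would keep their current behavior.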

stefandesu commented 2 years ago

Okay. I will implement solution 4 (or 2, or a combination of those two) for now, but long-term we would want to use 3 or 5.

stefandesu commented 2 years ago

While I was fixing this issue I realized that even in ConceptApi, solution 2 was actually the intended behavior, so that's what I have now implemented. 👍