gbv / cocoda-sdk

SDK for Cocoda and coli-conc services
https://gbv.github.io/cocoda-sdk/
MIT License

Support initialization of registries via BARTOC API types #34

Closed by nichtich 3 years ago

nichtich commented 3 years ago

BARTOC collects information about vocabulary APIs with two fields for each:

cocoda-sdk should support initialization of a registry with this data. See https://github.com/gbv/bartoc.org/blob/main/vue/utils.js for current implementation in BARTOC. Maybe cdk should internally support a registry cache as well?

stefandesu commented 3 years ago

My suggestions on how to implement this:

  1. The "registries" as currently used stay exactly the same.
  2. This issue is about accessing data for a certain concept scheme, right? So how about we add an initializeScheme method that takes as an input a concept scheme with the API field and adds certain methods to that object (like this, but I now see that it's missing some important scheme-related methods). After that, you'll be able to use scheme._getTop() to query the top concepts of that scheme, for example.
  3. Change adjustSchemes in a way that if the API field is given for a certain scheme, it will call initializeScheme for it. With this, we could register the BARTOC API in cocoda-sdk once (the old way) and all methods returning a scheme will adjust it so that data is coming from the API defined in API if available.
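The three steps above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the cocoda-sdk implementation: the function bodies are invented for illustration, `createRegistry` is a hypothetical factory that turns one API entry into a registry-like object with a `getTop` method, and each API entry is assumed to carry a `type` and a `url`.

```javascript
// Hypothetical sketch: none of these function bodies are taken from cocoda-sdk.
// `createRegistry` is an assumed factory that turns one API entry into a
// registry-like object exposing getTop({ scheme }).
function initializeScheme(scheme, createRegistry) {
  const api = (scheme.API || [])[0];
  if (!api) {
    return scheme; // nothing to attach without an API entry
  }
  const registry = createRegistry(api);
  // Properties defined this way are non-enumerable by default, so the
  // helpers do not show up in JSON.stringify(scheme).
  Object.defineProperties(scheme, {
    _registry: { value: registry, configurable: true },
    _getTop: {
      value: (config = {}) => registry.getTop({ ...config, scheme }),
      configurable: true,
    },
  });
  return scheme;
}

// Step 3: every scheme passing through adjustSchemes gets initialized
// if (and only if) it has an API field.
function adjustSchemes(schemes, createRegistry) {
  schemes.forEach((scheme) => initializeScheme(scheme, createRegistry));
  return schemes;
}

// Usage with a stand-in registry that returns fixed data (assumed shapes)
const fakeRegistry = (api) => ({
  getTop: ({ scheme }) => [{ uri: `${scheme.uri}/top`, via: api.url }],
});
const schemes = adjustSchemes(
  [
    {
      uri: "http://example.org/a",
      API: [{ type: "example-type", url: "http://example.org/api/" }],
    },
    { uri: "http://example.org/b" },
  ],
  fakeRegistry
);
console.log(schemes[0]._getTop());
console.log(typeof schemes[1]._getTop);
```

Attaching the helpers as non-enumerable properties is one possible design choice: it keeps serialized schemes unchanged while still allowing calls like `scheme._getTop()`.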

What do you think?

stefandesu commented 3 years ago

@nichtich I've started to implement this and encountered an issue that will probably affect many of BARTOC's vocabularies that have the API field set. The issue is that some (many?) APIs require using a particular vocabulary URI. For example, for STW the default URI is http://bartoc.org/en/node/313 (of course, since it's BARTOC), but the API expects http://zbw.eu/stw.

I saw that in BARTOC, you used a workaround by requesting top concepts for each possible scheme URI and then remembering which one returned a result. Unfortunately, this is neither feasible (requires multiple requests) nor useful (what if a scheme doesn't have top concepts?).

Do you have an idea on how to get around this? One idea I've yet to try is to use the getSchemes method of the same provider and then remember the main URIs for the supported schemes. If another request is made later and we need to know which of the URIs to use, we can find the correct URI in that list. This is only feasible if the API only has a limited number of schemes, though, because we need all of them.

stefandesu commented 3 years ago

I submitted the first implementation in branch issue-34. The current workaround I used for the above issue is the following: Each provider for schemes (i.e. ConceptApi and SkosmosApi) now keeps track of the main scheme URIs for all supported schemes. If it is unknown whether a certain scheme is supported, we request information about that scheme from the API. If information is available, it is saved in an "approvedSchemes" cache (so later we only need to retrieve the main scheme URI from that cache). If no information is available, it is assumed that the scheme is unsupported by the API and it is saved in a "rejectedSchemes" cache.
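To illustrate the caching idea described above, here is a minimal sketch. It is not the code from branch issue-34: the class name `SchemeUriResolver`, the injected `lookupScheme` function, and the assumption that alternative URIs live in the scheme's `identifier` array are all hypothetical.

```javascript
// Hypothetical sketch of the two caches; the class name, `lookupScheme`,
// and the data shapes are assumptions, not cocoda-sdk code.
class SchemeUriResolver {
  // lookupScheme(uris) is assumed to ask the API about a scheme once and
  // resolve to the scheme as the API knows it, or to null if unknown.
  constructor(lookupScheme) {
    this.lookupScheme = lookupScheme;
    this.approvedSchemes = new Map(); // any known URI -> main URI at the API
    this.rejectedSchemes = new Set(); // URIs the API did not recognize
  }

  async mainUri(scheme) {
    const uris = [scheme.uri, ...(scheme.identifier || [])].filter(Boolean);
    // Cache hit: the API is known to support this scheme
    const known = uris.find((uri) => this.approvedSchemes.has(uri));
    if (known) {
      return this.approvedSchemes.get(known);
    }
    // Cache hit: the API is known not to support this scheme
    if (uris.some((uri) => this.rejectedSchemes.has(uri))) {
      return null;
    }
    // Unknown: ask the API once and remember the outcome either way
    const result = await this.lookupScheme(uris);
    if (!result) {
      uris.forEach((uri) => this.rejectedSchemes.add(uri));
      return null;
    }
    uris.forEach((uri) => this.approvedSchemes.set(uri, result.uri));
    return result.uri;
  }
}

// Usage with a stand-in lookup that only knows STW under its own URI
let requests = 0;
const resolver = new SchemeUriResolver(async (uris) => {
  requests += 1;
  return uris.includes("http://zbw.eu/stw") ? { uri: "http://zbw.eu/stw" } : null;
});
const stw = {
  uri: "http://bartoc.org/en/node/313",
  identifier: ["http://zbw.eu/stw"],
};
resolver.mainUri(stw).then((uri) => console.log(uri));
```

With this shape, a second call to `mainUri` for the same scheme is answered from the cache and does not trigger another request.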

There is also an example usage under examples/bartoc.js which shows how easily cocoda-sdk can be used with this feature when using the BARTOC API. With minimal setup, we can request both schemes from the BARTOC API and concepts from the respective APIs for the schemes, if available.

nichtich commented 3 years ago

I've tried the example but it failed with EuroVoc because BARTOC did not include the URI http://eurovoc.europa.eu/domains, which is used to identify EuroVoc at the corresponding Skosmos instance.

Found scheme EuroVoc, it was initialized with SkosmosApi (https://bartoc-skosmos.unibas.ch/rest/v1/)
Loading top concepts for EuroVoc...
(node:101586) UnhandledPromiseRejectionWarning: InvalidOrMissingParameterError: Invalid or missing parameter: scheme (Missing or unsupported scheme or VOCID property on scheme)

I think the result is saved in rejectedSchemes. To make use of it applications need to catch the error and show a message or whatever is appropriate.

Looks ok except the handling of different providers in registryForScheme (https://github.com/gbv/cocoda-sdk/blob/issue-34/providers/base-provider.js#L485-L495) could be moved to the individual provider classes (ConceptApi and SkosmosApi) e.g. as method registryConfigForScheme(scheme).

stefandesu commented 3 years ago

I will look into the EuroVoc issue.

> Looks ok except the handling of different providers in registryForScheme (https://github.com/gbv/cocoda-sdk/blob/issue-34/providers/base-provider.js#L485-L495) could be moved to the individual provider classes (ConceptApi and SkosmosApi) e.g. as method registryConfigForScheme(scheme).

I think it would make even more sense to move the registryForScheme method to the CocodaSDK class and then do it like you're suggesting with a method on each provider class.
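A rough sketch of that split, assuming a static `registryConfigForScheme(scheme)` on each provider class and a top-level `registryForScheme` that asks each provider in turn. The classes below are simplified stand-ins for the real providers, and the API type strings and the shape of the returned config are assumptions made for illustration.

```javascript
// Hypothetical sketch: these classes are stand-ins, and the API type
// strings and config shape are assumed, not confirmed by this thread.
class ConceptApi {
  static registryConfigForScheme(scheme) {
    const api = (scheme.API || []).find(
      (entry) => entry.type === "http://bartoc.org/api-type/jskos"
    );
    return api ? { provider: "ConceptApi", api: api.url, schemes: [scheme] } : null;
  }
}

class SkosmosApi {
  static registryConfigForScheme(scheme) {
    const api = (scheme.API || []).find(
      (entry) => entry.type === "http://bartoc.org/api-type/skosmos"
    );
    return api ? { provider: "SkosmosApi", api: api.url, schemes: [scheme] } : null;
  }
}

// In the proposal this would live on the CocodaSDK class; a plain function
// keeps the sketch short. Each provider decides for itself whether it can
// handle the scheme, so no provider-specific logic is needed here.
const providers = [ConceptApi, SkosmosApi];

function registryForScheme(scheme) {
  for (const provider of providers) {
    const config = provider.registryConfigForScheme(scheme);
    if (config) {
      return config;
    }
  }
  return null; // no provider can handle this scheme
}

// Usage with an assumed EuroVoc record
const eurovoc = {
  uri: "http://eurovoc.europa.eu/domains",
  API: [
    {
      type: "http://bartoc.org/api-type/skosmos",
      url: "https://bartoc-skosmos.unibas.ch/rest/v1/",
    },
  ],
};
console.log(registryForScheme(eurovoc));
```

Returning `null` rather than throwing keeps the function safe to call for every scheme, including those without any usable API entry.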

stefandesu commented 3 years ago

> I think the result is saved in rejectedSchemes. To make use of it applications need to catch the error and show a message or whatever is appropriate.

You added the appropriate URI to the BARTOC entry and now it works, right? Because I couldn't reproduce the error.

With the other change from my previous comment, I have now changed registryForScheme (not yet pushed) to cycle through all entries in scheme.API and use the first one that returns an initialized registry. The issue with this is that we don't want to throw an error because the next entry might work. Also, the code is called in adjustScheme, which is called for every scheme returned by getSchemes, so I would say throwing an error wouldn't be appropriate.

However, we do need a way to detect errors when using API types. Any suggestion how to do this?

nichtich commented 3 years ago

> However, we do need a way to detect errors when using API types. Any suggestion how to do this?

The current way (throwing an error) seems fine.

stefandesu commented 3 years ago

> The current way (throwing an error) seems fine.

Ah, I only now understood that the error occurred during getTop, not in registryForScheme. So that's okay then. 👍 What I meant is that we currently don't have any way to detect errors in registryForScheme because it will try multiple things and we don't want to throw an error prematurely. In theory, we could keep track of the errors and if in the end nothing succeeds, we could throw an error containing an array of errors. Something like that.
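The "collect errors, throw only if nothing succeeds" idea could be sketched as follows. The error class and the `firstSuccessful` helper are hypothetical names introduced for illustration; newer JavaScript runtimes also ship a built-in `AggregateError` that serves a similar purpose.

```javascript
// Hypothetical sketch: the error class and helper are illustrative names,
// not part of cocoda-sdk.
class RegistryInitializationError extends Error {
  constructor(message, errors) {
    super(message);
    this.name = "RegistryInitializationError";
    this.errors = errors; // the individual failures, in attempt order
  }
}

// Runs each attempt in order and returns the first result that does not
// throw. Failures are remembered instead of being thrown right away, and
// are only reported together once every attempt has failed.
function firstSuccessful(attempts) {
  const errors = [];
  for (const attempt of attempts) {
    try {
      return attempt();
    } catch (error) {
      errors.push(error);
    }
  }
  throw new RegistryInitializationError(
    `All ${attempts.length} attempts failed`,
    errors
  );
}

// Usage: the first entry fails, the second one works
const registry = firstSuccessful([
  () => {
    throw new Error("unsupported API type");
  },
  () => ({ provider: "SkosmosApi" }),
]);
console.log(registry.provider);
```

A caller that wants the silent behavior discussed for adjustScheme can catch `RegistryInitializationError` and continue, while other callers can inspect `error.errors` to see why each entry failed.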

stefandesu commented 3 years ago

Merged into dev, will be included in the next release.