gbv / cocoda-sdk

SDK for Cocoda and coli-conc services
https://gbv.github.io/cocoda-sdk/
MIT License

Add MyCoRE classification provider #50

Closed by nichtich 1 year ago

nichtich commented 1 year ago

MyCoRE is used by VZG and other organizations for digital collections (around 70 instances). The software includes support for classifications (see documentation in German) with

The MyCoRE API includes three methods, with XML or JSON as response format:

  1. list all classifications
  2. get a classification
  3. get a category (aka concept), which does not seem to be available yet

Example API calls at https://bibliographie.schleswig-holstein.de:

nichtich commented 1 year ago

I've added the MyCoRE Classification API to BARTOC, although for a particular vocabulary it's just a plain URL such as https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json

As most "standard" vocabularies seem to be imported in many MyCoRE instances, BARTOC should only list vocabularies original to a MyCoRE instance.

The JSON format, based on the MyCoRE classification data model, is:

The full concept URIs must be built from the vocabulary URI and the concept ID, e.g. http://www.mycore.org/classifications/shbib_sachgruppen + 011000 should become http://www.mycore.org/classifications/shbib_sachgruppen/011000.
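As a minimal sketch of that rule (in Python for brevity, not the provider's actual code): the helper name `concept_uri` is hypothetical, and percent-encoding the concept ID is an assumption that only matters if IDs can contain characters such as `/` or spaces.

```python
from urllib.parse import quote


def concept_uri(vocabulary_uri: str, concept_id: str) -> str:
    """Join a vocabulary URI and a concept ID with exactly one slash.

    Assumption: the concept ID is percent-encoded so that unusual
    characters cannot break the resulting URI.
    """
    return vocabulary_uri.rstrip("/") + "/" + quote(concept_id, safe="")


print(concept_uri("http://www.mycore.org/classifications/shbib_sachgruppen", "011000"))
# → http://www.mycore.org/classifications/shbib_sachgruppen/011000
```

Stripping a trailing slash first means the result is the same whether or not the vocabulary URI ends in `/`.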

All queries except for the list of vocabularies require us to load the full vocabulary in one HTTP request and cache it. (The list is not required for current use cases, as we use BARTOC to point to particular vocabularies.) Open question: how long should the cache live? Most vocabularies don't change quickly, so a cache lifetime of at least one day should be ok. This may also be used in additional providers for small vocabularies:

Alternative: convert MyCoRE classification format to JSKOS and import into JSKOS server
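A rough sketch of what such a conversion could look like, in Python. The input shape is an assumption, not taken from this thread: each category is assumed to carry an `ID`, a `labels` list of `{"lang", "text"}` objects, and nested `categories`. The function name `to_jskos` and the sample labels are made up for illustration; the output uses the JSKOS fields `uri`, `notation`, `prefLabel`, `inScheme`, `broader` and `topConceptOf`.

```python
from typing import Any, Dict, List, Optional


def to_jskos(category: Dict[str, Any], scheme_uri: str,
             broader_uri: Optional[str] = None) -> List[Dict[str, Any]]:
    """Flatten one (assumed) MyCoRE category tree into JSKOS concepts."""
    uri = scheme_uri.rstrip("/") + "/" + category["ID"]
    concept: Dict[str, Any] = {
        "uri": uri,
        "notation": [category["ID"]],
        # JSKOS prefLabel maps language tags to a single label each
        "prefLabel": {l["lang"]: l["text"] for l in category.get("labels", [])},
        "inScheme": [{"uri": scheme_uri}],
    }
    if broader_uri is None:
        concept["topConceptOf"] = [{"uri": scheme_uri}]
    else:
        concept["broader"] = [{"uri": broader_uri}]

    concepts = [concept]
    for child in category.get("categories", []):
        concepts.extend(to_jskos(child, scheme_uri, uri))
    return concepts


# Made-up sample data in the assumed input shape
sample = {
    "ID": "01",
    "labels": [{"lang": "de", "text": "Beispiel"}],
    "categories": [{"ID": "011000", "labels": [{"lang": "de", "text": "Unterbeispiel"}]}],
}
concepts = to_jskos(sample, "http://www.mycore.org/classifications/shbib_sachgruppen")
```

The resulting flat list of concepts is the kind of input a JSKOS server import would typically expect; whether further fields are needed depends on the actual MyCoRE data.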

stefandesu commented 1 year ago

require to load the full vocabulary in one HTTP request and cache it

As I see it, https://bibliographie.schleswig-holstein.de/api/v2/classifications/shbib_sachgruppen.json already contains all required data, right?

I think as long as vocabularies don't get too big (since we need to keep it all in memory), it should be fine. I could start to implement it tomorrow. In theory, it shouldn't be a big thing as long as the API results are consistent.
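For illustration, the load-once-and-keep-in-memory idea could be sketched like this (Python for brevity, not the actual provider code). The class name `VocabularyCache`, the injectable `loader` and `clock` parameters, and the one-day default lifetime are assumptions based on the discussion above.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Tuple


def fetch_json(url: str) -> Any:
    """Load a whole vocabulary with a single HTTP request."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


class VocabularyCache:
    """Keep full vocabularies in memory and reload them after a lifetime."""

    def __init__(self, loader: Callable[[str], Any] = fetch_json,
                 ttl_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, url: str) -> Any:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no HTTP request
        data = self._loader(url)
        self._entries[url] = (now, data)
        return data
```

Calling `VocabularyCache().get(url)` with a vocabulary URL would fetch the data once and serve later calls from memory until the lifetime expires. Making the loader and clock injectable keeps the cache testable without network access.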

stefandesu commented 1 year ago

Stupid thing, but for completeness: It's "MyCoRe", not "MyCoRE". The provider is named correctly, but the documentation might not be fully consistent with it.

stefandesu commented 1 year ago

Currently only works with one specified vocabulary per registry (not an issue since it's used via BARTOC most of the time anyway). Listing of vocabularies is currently not supported. Will be released as experimental in v3.3.0.