gbv / jskos-server

Web service to access JSKOS data
https://coli-conc.gbv.de/api/
MIT License

Support client-side auto-increment for vocabulary URIs #161

Closed nichtich closed 2 years ago

nichtich commented 2 years ago

Creating a new vocabulary entry in BARTOC requires minting a new URI of the form http://bartoc.org/en/$NUMBER, where $NUMBER is guessed to be the next number in sequence (causing an error if the guess is wrong). I am not sure whether it makes sense to add support for such auto-increment URIs for concept schemes to jskos-server.

As far as I remember, URIs are only minted for mappings and annotations (unless the new record already contains the field uri), and these URIs are based on UUIDs (in the client, see bin/import.js, or on the server, see service/{mappings,annotations}.js).

Proposed solution: Support reverse sorting by the numerical URI element at GET /voc. The string used for numerical sorting (counter) is uri.substring(uri.lastIndexOf('/') + 1) and sorting is always numerically descending, so GET /voc?limit=1&sort=counter will return the vocabulary with the highest counter in its URI.

This allows client applications to assign numerically incrementing URIs to vocabularies.
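On the client side, the increment step could look roughly like the following. This is only a sketch: the function names counterOf and nextUri are hypothetical, the example URI is illustrative, and it assumes the client has already fetched the URI of the vocabulary with the highest counter.

```javascript
// Hypothetical helper: extract the numeric counter from the last path
// segment of a URI, using the same substring rule as proposed above.
// Returns NaN when the segment does not start with a number.
function counterOf(uri) {
  return parseInt(uri.substring(uri.lastIndexOf("/") + 1), 10)
}

// Hypothetical helper: given the URI with the highest counter,
// build the next URI by incrementing the counter by one.
function nextUri(highestUri) {
  const counter = counterOf(highestUri)
  if (Number.isNaN(counter)) {
    throw new Error(`No numeric counter in ${highestUri}`)
  }
  const prefix = highestUri.substring(0, highestUri.lastIndexOf("/") + 1)
  return prefix + (counter + 1)
}

// Illustrative URI, not taken from the database
console.log(nextUri("http://bartoc.org/en/node/20449"))
```

Two clients doing this at the same time could still mint the same URI, so the server-side error on conflict remains the actual safeguard.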

stefandesu commented 2 years ago

This could be done in MongoDB like this:

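db.getCollection('terminologies').aggregate([
    {
        // Add the numeric URI suffix as a temporary field (keeps all other fields)
        $set: {
            uriSuffixNumber: {
                $function: {
                    body: function(uri) {
                        // Parse everything after the last "/" as a base-10 integer
                        return parseInt(uri.substring(uri.lastIndexOf("/") + 1), 10)
                    },
                    args: [ "$uri" ],
                    lang: "js"
                }
            }
        }
    },
    {
        // Highest suffix first
        $sort: { uriSuffixNumber: -1 }
    },
    { $limit: 1 }
])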

This is O(N) though: it has to go through all vocabularies in the database and does not use an index. It currently takes about 160 ms in our BARTOC database, so it's not a big deal as long as the query is only run when a vocabulary is added. Also, the above query removes all additional fields, so if this is just an option for the sort parameter, it needs to return all data, of course. I will look into it tomorrow.

Edit: We can just use $set instead of $project to add a property.

stefandesu commented 2 years ago

I've just realized that this would conflict with https://github.com/gbv/bartoc.org/issues/167 though. Well, it would still work and there wouldn't be duplicate URIs, but if there's a mix of base URLs for URIs in the database, it will just choose the highest number independent of the base URL. This is such an edge case that I don't think it's an issue though.
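To illustrate the edge case: if a client ever needed the highest counter for one specific base URL only, it could filter by prefix before comparing. The following is a sketch with a hypothetical function name and illustrative URIs; it is not what the server does.

```javascript
// Hypothetical helper: find the highest numeric suffix among URIs whose
// part up to and including the last "/" equals the given base.
// Returns null when no URI matches.
function highestCounterForBase(uris, base) {
  let highest = null
  for (const uri of uris) {
    const slash = uri.lastIndexOf("/")
    // Skip URIs that belong to a different base URL
    if (uri.substring(0, slash + 1) !== base) continue
    const counter = parseInt(uri.substring(slash + 1), 10)
    if (Number.isNaN(counter)) continue
    if (highest === null || counter > highest) highest = counter
  }
  return highest
}

// Illustrative data with mixed base URLs
const uris = [
  "http://bartoc.org/en/node/7",
  "http://bartoc.org/en/node/12",
  "http://example.org/voc/99",
]
console.log(highestCounterForBase(uris, "http://bartoc.org/en/node/"))
```

Without the prefix check, the 99 from the other base URL would win, which is the behaviour described above.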

stefandesu commented 2 years ago

I have now added a first implementation of this: https://coli-conc.gbv.de/dev-api/voc?sort=counter&order=desc and http://esx-206.gbv.de:3013/voc?sort=counter&order=desc
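A client could build the request URL for the highest counter along these lines. This is a sketch: the function name is hypothetical, and combining limit=1 (from the original proposal) with sort=counter&order=desc is an assumption about how the parameters work together.

```javascript
// Hypothetical helper: build the /voc request URL that asks for the
// vocabulary with the highest counter. apiBase must end with "/".
function highestCounterUrl(apiBase) {
  const url = new URL("voc", apiBase)
  url.searchParams.set("sort", "counter")
  url.searchParams.set("order", "desc")
  // Assumption: limit=1 restricts the result to the top entry
  url.searchParams.set("limit", "1")
  return url.toString()
}

console.log(highestCounterUrl("https://coli-conc.gbv.de/dev-api/"))
```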

stefandesu commented 2 years ago

Will be part of next week's release if there are no complaints @nichtich. ;)