gbv / jskos-server

Web service to access JSKOS data
https://coli-conc.gbv.de/api/
MIT License

Extend parameter `annotatedFor` on GET /mappings #176

Closed stefandesu closed 2 years ago

stefandesu commented 2 years ago

We need some additional values for parameter annotatedFor on GET /mappings:

Required for https://github.com/gbv/cocoda/issues/605.

stefandesu commented 2 years ago

Unfortunately, the new negative assertions (`none` and `!xyz`) cause big performance issues because the query always has to go through the whole result set. There are actually two issues:

  1. Sorting the results takes very long. I was able to mitigate this by moving the sorting step to a different place in the query pipeline, so it's no longer a problem.
  2. Counting the total results of the query is the bigger issue for the above additions. With a big result set, counting takes 30+ seconds because everything has to be loaded from disk.

My current workaround for issue 2 is to skip counting and return an unreasonably big number for `X-Total-Count` (like 9 999 999). Not returning any total count breaks our client applications (and I couldn't find a convention for saying "there are more values, but we don't know how many"). As far as I can see, the only disadvantage is that it's a fake value. I think if this is documented properly, it should be okay.
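A minimal sketch of the placeholder workaround described above, written in Python for illustration only (it is not the server's code). The names `PLACEHOLDER_TOTAL`, `is_negative`, and `total_count_header` are hypothetical, and the sketch assumes that negative assertions are exactly `none` and values starting with `!`:

```python
from typing import Callable, Dict, Optional

# Placeholder from the comment above; it is a fake value and must be documented as such.
PLACEHOLDER_TOTAL = 9_999_999


def is_negative(annotated_for: Optional[str]) -> bool:
    """Assumption: negative assertions are `none` and anything starting with `!`."""
    return annotated_for is not None and (
        annotated_for == "none" or annotated_for.startswith("!")
    )


def total_count_header(
    annotated_for: Optional[str], count_exact: Callable[[], int]
) -> Dict[str, str]:
    """Build the X-Total-Count header, skipping the expensive count for negative assertions."""
    if is_negative(annotated_for):
        # The exact count is never computed here, so the slow full scan is avoided.
        return {"X-Total-Count": str(PLACEHOLDER_TOTAL)}
    return {"X-Total-Count": str(count_exact())}
```

A client could then treat a total equal to the placeholder as "more results, number unknown", which only works if the value is documented.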

@nichtich Do you have an opinion on this?

stefandesu commented 2 years ago

I was able to find a different solution for counting the total results for negative assertions on `annotatedFor`: count the results for the same query without `annotatedFor`, then subtract the count for the same query with the opposite `annotatedFor` value. The code changes for this were a little difficult, but it seems to work very well. Closing this issue now!
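The subtraction described above can be illustrated with a small in-memory sketch in Python. This is not the server's implementation. The data shape (`motivations` as a set per mapping), the filter semantics, and the choice of `any` as the opposite of `none` are all assumptions made for the example, and every function name is hypothetical:

```python
from typing import Callable, Dict, List, Optional

Mapping = Dict[str, object]  # hypothetical shape: {"fromScheme": str, "motivations": set}


def matches(mapping: Mapping, annotated_for: str) -> bool:
    """Assumed filter semantics for annotatedFor in this sketch."""
    motivations = mapping["motivations"]
    if annotated_for == "any":
        return len(motivations) > 0
    if annotated_for == "none":
        return len(motivations) == 0
    if annotated_for.startswith("!"):
        return annotated_for[1:] not in motivations
    return annotated_for in motivations


def opposite(annotated_for: str) -> str:
    """Assumption: `none` <-> `any`, and `!xyz` <-> `xyz`."""
    if annotated_for == "none":
        return "any"
    if annotated_for == "any":
        return "none"
    if annotated_for.startswith("!"):
        return annotated_for[1:]
    return "!" + annotated_for


def count(mappings: List[Mapping], base: Callable[[Mapping], bool],
          annotated_for: Optional[str] = None) -> int:
    """Count mappings that pass the base query and, optionally, annotatedFor."""
    return sum(
        1 for m in mappings
        if base(m) and (annotated_for is None or matches(m, annotated_for))
    )


def count_negative(mappings: List[Mapping], base: Callable[[Mapping], bool],
                   annotated_for: str) -> int:
    """Total for a negative assertion = count(no filter) - count(opposite filter)."""
    return count(mappings, base) - count(mappings, base, opposite(annotated_for))


MAPPINGS: List[Mapping] = [
    {"fromScheme": "A", "motivations": {"assessing"}},
    {"fromScheme": "A", "motivations": set()},
    {"fromScheme": "A", "motivations": {"moderating"}},
    {"fromScheme": "B", "motivations": set()},
    {"fromScheme": "A", "motivations": {"assessing", "moderating"}},
]


def from_a(m: Mapping) -> bool:
    return m["fromScheme"] == "A"
```

With this sample data, `count_negative(MAPPINGS, from_a, "none")` and `count_negative(MAPPINGS, from_a, "!assessing")` give the same totals as counting the negative filter directly. The identity holds because every mapping in the base query matches either the filter or its opposite, never both.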