gbv / jskos-proxy

HTTP proxy to serve JSKOS objects
https://uri.gbv.de/terminology/
MIT License

Timeout when opening a vocabulary #46

Open StiftungAusNachlass opened 2 months ago

StiftungAusNachlass commented 2 months ago

When I open a vocabulary, all top-level records are fetched via /top.

Example: https://uri.gbv.de/terminology/http%3A%2F%2Fet-iblk.org%2Fscheme%2F/ (https://uri.gbv.de/terminology/etiras/) calls

https://api.dante.gbv.de/voc/top?properties=%2Bcreated,issued,modified,editorialNote,scopeNote,note,definition,mappings,location&limit=10000&uri=http:%2F%2Fet-iblk.org%2Fscheme%2F&language=en,de,fr,es,nl,it,fi,pl,ru,cs,jp

If the response is not in the cache, the request takes a long time and is eventually aborted, apparently by a timeout. To reproduce the timeout, add the parameter cache=0 to the query. Also, the maximum limit for dante-api is 1000, so there is no need to request 10,000.

Of course, a smaller query would be better, because no one can look at 1000 data records in the small scroll window anyway. Even better: pagination. :-)
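
To make the point about the limit concrete, here is a minimal sketch of building the /top request with the limit clamped to 1000 and an optional cache=0 for reproducing the slow path. The function name `build_top_url` and its arguments are hypothetical and not part of jskos-proxy; only the endpoint, the parameter names, and the maximum of 1000 come from this thread.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the request quoted above.
API_BASE = "https://api.dante.gbv.de/voc/top"
# Maximum limit of dante-api as stated in this thread.
MAX_LIMIT = 1000


def build_top_url(scheme_uri, limit=10000, properties="", bypass_cache=False):
    """Build a /top request URL (hypothetical helper, for illustration only)."""
    params = {
        "uri": scheme_uri,
        # Asking for more than the API maximum gains nothing, so clamp it.
        "limit": min(limit, MAX_LIMIT),
    }
    if properties:
        params["properties"] = properties
    if bypass_cache:
        # cache=0 is the parameter suggested above to simulate the timeout.
        params["cache"] = 0
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_top_url("http://et-iblk.org/scheme/", bypass_cache=True))
```
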

stefandesu commented 2 months ago

I agree that 1000 records are a lot, but are a lot of pages with a smaller number of records each really that much better? I'd rather scroll than click through dozens of pages. Pagination for the top concepts would also make other things (in particular showing a certain concept in its context, i.e. the hierarchy) a lot more complicated.

I wonder why the request takes that much time anyway. 43 seconds (during my last test) seems a lot to me for a request that should be trivial (and indexed). And the data transferred is not that much either (410 kB compressed). In general, I noticed during development that a lot of requests to DANTE take unreasonably long if not in cache. When I hooked it up to our own API based on jskos-server, everything felt 10 times faster (because most requests are fulfilled near instantly).

StiftungAusNachlass commented 2 months ago

It's not without reason that the graphic designer included pagination. "..." helps limit the paginator's view.

Unfortunately, getting the results from the easydb5 application takes this long. In this case it takes extra long because "mappings" is requested in the properties, so the application has to look up and parse the other records to assemble the mapping information. Without the mappings property it takes "only" 13 seconds instead of 35. Maybe it's enough to remove "mappings" at this point?

[Screenshot: 2024-09-04, 13:08:29]

nichtich commented 2 months ago

Pagination as a layout is independent of paginated queries. The list can also be shown as a whole but loaded in batches.
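
A rough sketch of what "shown as a whole but loaded in batches" could look like: the list keeps growing while fixed-size batches arrive. The names `load_in_batches` and `fetch_page` are hypothetical, and the sketch assumes the backend accepts an offset alongside limit, which this thread does not confirm for dante-api. The fetch function is injected, so no real API call is made here.

```python
from typing import Callable, Iterator, List


def load_in_batches(
    fetch_page: Callable[[int, int], List[dict]],
    batch_size: int = 100,
) -> Iterator[List[dict]]:
    """Yield batches of records until the source is exhausted.

    fetch_page(offset, limit) is an assumed callback that returns at most
    `limit` records starting at `offset`.
    """
    offset = 0
    while True:
        batch = fetch_page(offset, batch_size)
        if batch:
            yield batch
        # A short (or empty) batch means there is nothing left to load.
        if len(batch) < batch_size:
            break
        offset += len(batch)


# Stand-in for the real API: 250 fake top concepts held in memory.
FAKE_RECORDS = [{"uri": f"http://example.org/concept/{i}"} for i in range(250)]


def fake_fetch_page(offset: int, limit: int) -> List[dict]:
    return FAKE_RECORDS[offset:offset + limit]


if __name__ == "__main__":
    shown: List[dict] = []
    for batch in load_in_batches(fake_fetch_page, batch_size=100):
        shown.extend(batch)  # the UI would append to the single visible list
        print(f"{len(shown)} records shown so far")
```

The layout stays a single scrollable list; only the transport is split up, so the first records become usable before the rest have arrived.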

stefandesu commented 2 months ago

> Maybe it's enough to remove "mapping" at this point?

Mappings and other additional data are now only loaded when a particular concept is selected (so far only in Dev). This should reduce loading times enough so that this is less of a problem.
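
As an illustration of this idea (not the actual jskos-proxy code), the sketch below requests the expensive properties only when a concept is selected and remembers the result, so selecting the same concept twice costs one request. The class `ConceptDetails` and the callback `fetch_details` are hypothetical; the property list is copied from the request quoted earlier in this thread.

```python
from typing import Callable, Dict

# Properties from the original /top request; in this sketch they are only
# requested for a single selected concept, not for the whole list.
DETAIL_PROPERTIES = (
    "+created,issued,modified,editorialNote,scopeNote,"
    "note,definition,mappings,location"
)


class ConceptDetails:
    """Load detailed data for a concept on first selection (illustrative)."""

    def __init__(self, fetch_details: Callable[[str, str], dict]):
        # fetch_details(uri, properties) is an assumed callback that performs
        # the actual request against the API.
        self._fetch_details = fetch_details
        self._loaded: Dict[str, dict] = {}

    def select(self, uri: str) -> dict:
        # Only hit the backend the first time a concept is selected.
        if uri not in self._loaded:
            self._loaded[uri] = self._fetch_details(uri, DETAIL_PROPERTIES)
        return self._loaded[uri]
```
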

I do like the idea of loading the list in batches. This would make it usable quickly while additional items are loaded in the background. I can look into it.