RDFLib / prez

Prez is a data-configurable Linked Data API framework that delivers profiles of Knowledge Graph data according to the Content Negotiation by Profile standard.
BSD 3-Clause "New" or "Revised" License

Slow API requests #166

Closed: jamiefeiss closed this issue 7 months ago

jamiefeiss commented 8 months ago

On both local and live instances (at least the IDN), the API tends to take over a second to return most responses. I think it would be good to review what work the API is doing and how we can reduce these wait times where possible.

For example, here is an API call for a pretty small vocab (52 concepts) with Prez running locally pointing to a remote Jena instance:

Image

Here is an API call of a feature collection in the live IDN instance:

Image

recalcitrantsupplant commented 8 months ago

The known major slowdown that was introduced is link generation. This involves first querying for the class of each URI, then requesting parent URIs if the link template requires them. A larger vocab Howard showed me had ~250 URIs, so this would be 250 class queries plus a smaller number of parent queries. These queries are individually cheap, but the volume makes things slow in aggregate.

With the current model of looking for a link for anything that is a URI, I think all we can do is cache the links and class information once they are generated. I've added link caching to a branch, and I'll create a PR that includes it in the next few days. I will also try to add class caching. Link caching alone means large initial queries (~30s in Howard's case) take <1s once cached.
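To illustrate the caching idea, here is a minimal sketch, not the code on the branch. Everything in it is hypothetical: the `LinkCache` class, the `fake_class_query` and `fake_build_link` stand-ins, and the link format are invented for the example, and the real class lookup would be a SPARQL query against the triplestore.

```python
from typing import Callable, Dict, List


class LinkCache:
    """Hypothetical helper: memoises per-URI class lookups and generated links."""

    def __init__(
        self,
        query_class: Callable[[str], str],
        build_link: Callable[[str, str], str],
    ) -> None:
        self._query_class = query_class
        self._build_link = build_link
        self._classes: Dict[str, str] = {}
        self._links: Dict[str, str] = {}

    def class_of(self, uri: str) -> str:
        # Only hit the backend the first time a URI is seen.
        if uri not in self._classes:
            self._classes[uri] = self._query_class(uri)
        return self._classes[uri]

    def link_for(self, uri: str) -> str:
        # A cached link skips both the class lookup and the link building.
        if uri not in self._links:
            self._links[uri] = self._build_link(uri, self.class_of(uri))
        return self._links[uri]


# Stand-ins for the real backend calls (assumptions for this example only).
calls: List[str] = []


def fake_class_query(uri: str) -> str:
    calls.append(uri)  # record each simulated round trip
    return "skos:Concept"


def fake_build_link(uri: str, klass: str) -> str:
    return f"/object?uri={uri}&class={klass}"  # made-up link format


cache = LinkCache(fake_class_query, fake_build_link)
uris = [f"https://example.com/concept/{i}" for i in range(250)]

first = [cache.link_for(u) for u in uris]   # cold: one class query per URI
second = [cache.link_for(u) for u in uris]  # warm: served from memory

print(f"class queries issued: {len(calls)}")
```

The second pass issues no further class queries, which is the effect described above for repeat requests. A plain dict never expires, so a real cache would also need an invalidation or expiry policy for when the underlying data changes.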

Other than this, caching responses in general would probably help. I haven't figured out how to profile the code yet, but I don't believe any of the other processing (e.g. ConnegP) is slow, since it all operates on small volumes of data in memory. It would be good to confirm this if possible, though.
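One possible way to check where the time goes is Python's standard-library `cProfile` and `pstats` modules. The sketch below is only an illustration: `handle_request`, `slow_link_generation` and `render_response` are hypothetical stand-ins rather than Prez functions, and profiling the real async request handlers may need a different setup.

```python
import cProfile
import io
import pstats
import time


def slow_link_generation(n: int) -> None:
    # Stand-in for many small, individually cheap backend queries.
    for _ in range(n):
        time.sleep(0.001)


def render_response() -> int:
    # Stand-in for in-memory processing on a small volume of data.
    return sum(range(1000))


def handle_request() -> int:
    slow_link_generation(20)
    return render_response()


# Collect timing data for a single simulated request.
profiler = cProfile.Profile()
profiler.enable()
result = handle_request()
profiler.disable()

# Write the ten entries with the highest cumulative time to a string buffer.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate a request at the top of the report, which would help confirm or rule out the ConnegP processing as a bottleneck.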