edmondchuc closed this issue 9 months ago
Related issue: https://github.com/RDFLib/prez/issues/166
Caching, once introduced, may help, but for large vocabs the initial request will still be slow. Conditional prez-link generation may be needed. The issue potentially arises where there is a large number of top concepts. Further investigation is required.
Solved in the current set of changes by using a simple cache; however, the initial request load time is not improved.
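To illustrate why a cache helps repeat requests but not the first one, here is a minimal sketch. The class `LinkCache`, the function `fake_resolve`, and the example IRIs are hypothetical stand-ins, not Prez's actual code; the dictionary represents whatever simple cache the changes use.

```python
from typing import Callable, Dict, Optional


class LinkCache:
    """Hypothetical in-memory cache mapping an IRI to its generated link."""

    def __init__(self, resolve: Callable[[str], Optional[str]]) -> None:
        self._resolve = resolve  # the expensive lookup, e.g. a database query
        self._links: Dict[str, Optional[str]] = {}
        self.misses = 0  # how many expensive lookups were actually performed

    def get(self, iri: str) -> Optional[str]:
        if iri not in self._links:
            self.misses += 1
            self._links[iri] = self._resolve(iri)
        return self._links[iri]


def fake_resolve(iri: str) -> Optional[str]:
    # Stand-in for real link generation; assumed for illustration only.
    return "/object?uri=" + iri


cache = LinkCache(fake_resolve)
top_concepts = [f"http://example.org/concept/{i}" for i in range(300)]

first = [cache.get(iri) for iri in top_concepts]   # cold cache: one lookup per IRI
misses_after_first = cache.misses
second = [cache.get(iri) for iri in top_concepts]  # warm cache: no new lookups
misses_after_second = cache.misses
```

The first pass still pays for every lookup, which matches the observation that the initial request load time is not improved.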
Other options to discuss:
Meeting to be scheduled for next Monday.
My current understanding:
/object requests are slow because, regardless of the class type or the profile in use, the endpoint always describes the entire object and converts the IRIs it finds into prez links.
[...] /s, /c) [...] skos:hasTopConcept when rendering a vocab
[...] dcat:Dataset with many relationships (dcterms:hasPart to dcat:Resource objects, for example).
[...] /s and /c systems.
@recalcitrantsupplant please check this out and let's discuss it on Monday in our design session, thanks.
I ran the link generation through a debugger last night and the RDFLib query linked below is slow: it takes seconds. I'm not sure whether the performance regressed after some changes I made to it, but regardless, it shouldn't take that long. I switched it out for PyOxigraph and it now runs in milliseconds. I think this will resolve most of the issues.
[...] skos:hasTopConcept values to be included. If included, it will perform the expensive processing to get the prez links.
[...] /object endpoint with this resource: https://bgs.dev.kurrawong.ai/v/vocab/rf:Lexicon.
Closing as completed.
Currently, the /object endpoint takes an IRI and loads an object. It's slow because it issues an extra database query for each IRI it finds within the resource to generate the prez links (the N+1 query pattern). So for a concept scheme with hundreds of top concepts, it will query the database for each one and attempt to generate a prez link. As you have mentioned in the past, this is the naive solution: slow but correct.
Is there any way we can use some of the info in the profiles to explicitly ignore relationships we don't want included during prez link generation? This kind of handling is similar to the custom queries we have now for the VocPrez endpoints, which support progressive loading of large vocabs: there we needed to ensure we were only loading the proximate information needed to render the UI, not a description of the entire vocab (as a SPARQL DESCRIBE query would).
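The N+1 pattern described above can be sketched with a toy in-memory store that counts queries. Everything here (`CountingStore`, `generate_links_naive`, the example IRIs and the link format) is hypothetical and only meant to show how the query count grows with the number of top concepts; it is not how Prez is implemented.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"


class CountingStore:
    """Hypothetical triple store that records how many queries it serves."""

    def __init__(self, triples: List[Triple]) -> None:
        self.triples = triples
        self.query_count = 0

    def describe(self, iri: str) -> List[Triple]:
        self.query_count += 1
        return [t for t in self.triples if t[0] == iri]

    def lookup_type(self, iri: str) -> Optional[str]:
        self.query_count += 1
        for s, p, o in self.triples:
            if s == iri and p == RDF_TYPE:
                return o
        return None


def generate_links_naive(store: CountingStore, iri: str) -> Dict[str, str]:
    """One query for the resource, then one more per IRI found in it."""
    links: Dict[str, str] = {}
    for _, _, obj in store.describe(iri):        # 1 query
        if store.lookup_type(obj) is not None:   # +1 query per object IRI
            links[obj] = "/object?uri=" + obj
    return links


scheme = "http://example.org/scheme"
concepts = [f"http://example.org/concept/{i}" for i in range(200)]
data: List[Triple] = [(scheme, SKOS + "hasTopConcept", c) for c in concepts]
data += [(c, RDF_TYPE, SKOS + "Concept") for c in concepts]

store = CountingStore(data)
links = generate_links_naive(store, scheme)
total_queries = store.query_count  # 1 describe + 200 type lookups
```

With 200 top concepts the toy store serves 201 queries for a single request, so the cost scales linearly with the size of the concept scheme.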
I'm raising this because in the BGS case, they have very large vocabularies with hundreds or thousands of concepts, and a render on the /object endpoint is extremely slow. For example, https://data-uat.bgs.ac.uk/object?uri=http://data.bgs.ac.uk/ref/Geochronology. If we can somehow instruct Prez not to include certain properties on routes, such as skos:hasTopConcept/skos:topConceptOf, then the loading of this vocab on the /object endpoint will be very fast, I think.
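As a rough sketch of the idea, excluded properties could be dropped before link generation so that the expensive per-IRI processing never sees them. The names `EXCLUDED_PREDICATES`, `filter_triples` and `link_candidates` are hypothetical, and how such an exclusion list would actually be expressed in a Prez profile is an open question here.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

SKOS = "http://www.w3.org/2004/02/skos/core#"
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical exclusion list; assumed to come from a profile in some form.
EXCLUDED_PREDICATES: FrozenSet[str] = frozenset(
    {SKOS + "hasTopConcept", SKOS + "topConceptOf"}
)


def filter_triples(
    triples: Iterable[Triple], excluded: FrozenSet[str]
) -> List[Triple]:
    """Keep only triples whose predicate is not excluded."""
    return [t for t in triples if t[1] not in excluded]


def link_candidates(triples: Iterable[Triple]) -> Set[str]:
    """Object IRIs that would still need a prez link after filtering."""
    return {obj for _, _, obj in triples}


scheme = "http://example.org/scheme"
triples: List[Triple] = [
    (scheme, SKOS + "hasTopConcept", f"http://example.org/concept/{i}")
    for i in range(500)
]
triples.append((scheme, DCTERMS + "creator", "http://example.org/agent/bgs"))

kept = filter_triples(triples, EXCLUDED_PREDICATES)
candidates = link_candidates(kept)
```

In this toy data, 501 triples shrink to one, leaving a single IRI that needs a link instead of 501.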