UUDigitalHumanitieslab / EDPOP

A virtual research environment (VRE) that lets you collect, align and annotate bibliographical and biographical records from several online catalogs.
BSD 3-Clause "New" or "Revised" License
1 stars 0 forks source link

Catalogue caching #163

Closed tijmenbaarda closed 2 months ago

tijmenbaarda commented 4 months ago

Note: I think that this pull request is not yet ready for merging - my goal is rather to obtain feedback (but draft pull requests are not possible here).

In this pull request, I make sure that all records that are requested from catalogues (through edpop-explorer) are saved in Blazegraph in the record graph, as we agreed. The main advantage of doing this is that the records don't have to be sent back from the frontend to the backend. Another advantage could be that data can quickly be re-requested, but because that would require us also to store information about the position of every record in the query results, I went for another solution, namely the caching of readers.

So in addition to storing records in the record graph, with every fetch from an external API the Reader object (from edpop-explorer) is being cached, which allows us to retain both the records that were already fetched and their position in the query results.

I am not completely happy about this solution, and I still have quite some questions about the use of rdflib and Blazegraph, but hopefully this will facilitate discussion.

I use an updated version of edpop-explorer for this PR, the changes of which are explained in this pull request.

jgonggrijp commented 4 months ago

@tijmenbaarda What are you not completely happy about, and what questions do you have about rdflib and Blazegraph? Or would you rather discuss that after I've taken a look at the code?

You can review your own PR, if that helps to associate your comments/questions with particular bits of code. If you want to save those questions/comments until after I've looked at the code, simply keep your comments "pending" (by not "submitting your review").

tijmenbaarda commented 4 months ago

Thank you very much for the thorough comments, which are very helpful!

I will address them in the coming days, but I think I will need some more time. In the meantime, I will make a new PR where I will just propose to merge the commits that deal with the generic Blazegraph access functionality, so that @lukavdplas can continue with his work on the annotation APIs without waiting for me.

tijmenbaarda commented 3 months ago

I have done some additional outfactoring as requested and added unit tests where relevant. I think this may be ready to merge now, therefore I ask for another review.