buda-base / public-digital-library

http://library.bdrc.io

combine queries to speed up loading #941

Open berger-n opened 1 month ago

berger-n commented 1 month ago

for example on https://library-dev.bdrc.io/show/bdr:MW22084

eroux commented 1 month ago

Thanks! Let's start with a query that combines instance, scan and work. It should also have enough data to make another quick query for the etext snippet, without having to call the query for the etext outline.

I think it's fine if the outline stays in a different query for now. But maybe the website can detect that the user is on a MW and just start the query of the outline before it gets the return of the main query?
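To make the idea concrete, here is a minimal sketch of starting the outline query alongside the main query instead of waiting for the main query to return. Everything in it is an assumption for illustration: `fetch_main` and `fetch_outline` are hypothetical stand-ins that only simulate latency, and detecting an MW from the `bdr:MW` prefix is a guess at how the website could do it, not how it does.

```python
import asyncio


# Hypothetical stand-ins for the real network calls; they only simulate latency.
async def fetch_main(res_id: str) -> dict:
    await asyncio.sleep(0.05)
    return {"query": "main", "id": res_id}


async def fetch_outline(res_id: str) -> dict:
    await asyncio.sleep(0.05)
    return {"query": "outline", "id": res_id}


def is_instance(res_id: str) -> bool:
    # Assumption: an MW can be recognized from the "bdr:MW" prefix of its ID.
    return res_id.startswith("bdr:MW")


async def load(res_id: str) -> list:
    # Always run the main query; add the outline query right away for an MW,
    # so both requests are in flight at the same time.
    tasks = [fetch_main(res_id)]
    if is_instance(res_id):
        tasks.append(fetch_outline(res_id))
    # gather() returns the results in the same order as the tasks.
    return await asyncio.gather(*tasks)


results = asyncio.run(load("bdr:MW22084"))
print([r["query"] for r in results])
```

With both requests running concurrently, the total wait is roughly that of the slower query rather than the sum of the two.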

berger-n commented 1 month ago

"maybe the website can detect that the user is on a MW and just start the query of the outline before it gets the return of the main query"

not so sure about this... this was a long time ago but see for example https://github.com/buda-base/public-digital-library/issues/641 or the commit in https://github.com/buda-base/public-digital-library/issues/754 (there are several cases, it's not a one-size-fits-all query)

eroux commented 1 month ago

Ok, the first step is the following: you will get all the information about the work and reproductions for an instance through

http://ldspdi-dev.bdrc.io/query/graph/ResInfo-SameAs_MW?R_RES=bdr%3AW22084&format=json
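For reference, a small sketch of how a client could build that URL from a resource ID. The base URL, the `R_RES` parameter and `format=json` come from the link above; the helper name `graph_query_url` is made up for the example.

```python
from urllib.parse import urlencode

# Base taken from the URL quoted in this thread (dev server).
BASE = "http://ldspdi-dev.bdrc.io/query/graph/"


def graph_query_url(template: str, res_id: str) -> str:
    # urlencode percent-encodes the ":" of the prefixed ID (bdr:... becomes bdr%3A...).
    params = urlencode({"R_RES": res_id, "format": "json"})
    return f"{BASE}{template}?{params}"


print(graph_query_url("ResInfo-SameAs_MW", "bdr:W22084"))
```

This prints the same URL as above.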

berger-n commented 1 month ago

thanks! now loading all the MW / W / WA / IE at once: https://library-dev.bdrc.io/show/bdr:MW22084

[image]

eroux commented 1 month ago

thanks! that's much faster, although still some room for improvement for the etext... the best for that would probably be to have a dedicated endpoint for the snippet, I'll try to do that next week

eroux commented 1 month ago

ok, so I think a faster way to get to fewer etext queries would be to just call the endpoint that returns the first X characters of the etext (like https://ldspdi.bdrc.io/resource/UTXXX.txt?startChar=s&endChar=e). And we can just display it as is, without much presentation (no links to images, etc.). @berger-n do you have all the information you need to do that in the main query? (You need the VL, UT and start/endchar in the case where the record is an outline node.)
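A rough sketch of building the snippet URL on the client side, following the URL pattern above. The helper name `etext_snippet_url` and the range check are assumptions for illustration; `UTXXX` is kept as the placeholder from the example URL, and whether `endChar` is inclusive is not specified here.

```python
from urllib.parse import urlencode


def etext_snippet_url(ut_id: str, start_char: int, end_char: int) -> str:
    # Reject ranges that cannot describe a snippet.
    if start_char < 0 or end_char < start_char:
        raise ValueError("expected 0 <= start_char <= end_char")
    query = urlencode({"startChar": start_char, "endChar": end_char})
    # URL pattern copied from the example in this comment.
    return f"https://ldspdi.bdrc.io/resource/{ut_id}.txt?{query}"


# First 1000 characters of a (placeholder) etext unit.
print(etext_snippet_url("UTXXX", 0, 1000))
```

For an outline node, the start/endchar from the main query would be passed instead of `0` and a fixed length.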

eroux commented 1 month ago

Thanks! Now I'm actually wondering why the collection and manifest are queried: we already have the URL of the thumbnail from the query, right? Couldn't we just display it?

berger-n commented 1 month ago

it's almost working indeed and looks promisingly faster! an eTextVolumeForImageGroup in the EtextVolume data could come in handy though (to display the images)

there's also the OCR warning that's missing (but the data is already there, so it's just something minor to fix I guess). I'll need to check the unaligned and copyrighted cases as well

regarding the manifest I think it has uses beyond the thumbnail (like whether there's a download link or not)

berger-n commented 1 month ago

note that in the case of https://library-dev.bdrc.io/show/bdr:MW22084_0044_2 there's an additional call to ResInfo_MW (regarding the root instance) which maybe could be merged in the first one?

eroux commented 1 month ago

ah yes, much better, thanks!

Let's merge also the call to the root MW.

I'm not really persuaded we need to have an option to display the images with the etext in the preview, it seems a bit too much for the preview... what do you think @roopeux ?

In terms of UX, I think the etext preview should be limited vertically to a relatively short box that users can scroll; right now it takes up too much space

Yes I kind of remember complicated cases for the manifest... but maybe we can find a way to improve it too, I think we've reached a pretty good state here so it's maybe less of a priority, but at some point can you list the uses of the manifest so we can see if we can replace it?

eroux commented 1 month ago

P.S.: can you tell me what you need from the root instance?

roopeux commented 1 month ago

I agree, the etext preview should be visually about the same size as the scans. I don't think it even adds much if users can scroll the preview. It could be pretty short, and clicking anywhere in that div would open the etext viewer.

So if displaying images from etext preview introduces any kind of tech complication it should be disabled.

berger-n commented 1 month ago

an attempt at a shorter, non scrollable, fully clickable etext preview:

[image]


the last example is still a bit heavy to load (because of the outline queries) but the others are now quite fast (here there's no outline, or it is loaded only when asked):


regarding the iiif manifests, having a look at https://library.bdrc.io/show/bdr:MW22084_0045 reminds me that we used to find the first image of a "subinstance" from them (which has not been reimplemented yet in the new UX, though it should be straightforward using the data that's already there)

but could there be this dedicated thumbnail like in https://ldspdi-dev.bdrc.io/query/graph/ResInfo-SameAs_MW?R_RES=bdr%3AMW22084_0045&format=json?

and also the scan info is not in the new query, but is shown after loading the outline/root query (for example here: https://library-dev.bdrc.io/show/bdr:MW22084_0051)

eroux commented 1 month ago

that's really good, this is a great improvement!