DDMAL / musiclibs

:guitar: Searching IIIF Manifests

Thumbnail saving? #114

Open agpar opened 8 years ago

agpar commented 8 years ago

I'm interested in whether you think it would be worth saving thumbnails and serving them ourselves. It's pretty underwhelming (the steak lacks serious sizzle) when every thumbnail on the site takes 5-15 seconds to load (if it loads at all).

Pros:

Cons:

Do you think it's worth it?

ahankinson commented 8 years ago

I think the answer is to get those services to use IIIF thumbnails. Could you make a note of them on the problems document?

https://docs.google.com/document/d/1Lyg7FzBeTHBeC0qbPmFzJfyxpK9ySBKyoiYs0TQkCIs/edit#

In general, I'm hesitant to do this. For DIAMM, for example, if a source does not have a pre-defined thumbnail, I choose a random one from the available page images for the manifest. So the thumbnail actually changes from load to load.

Why not go halfway and set up a memcached caching layer, with the thumbnails cached by URL key? When a thumbnail is requested we can check the cache for it and serve it quickly; otherwise, we fetch and then cache.
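A minimal sketch of that check-then-fetch flow (cache-aside), for illustration only. The names `ThumbnailCache`, `cache_key`, and the injected `fetch` callable are hypothetical, and a plain dict stands in for memcached so the example runs on its own. The URL is hashed before use as a key because memcached keys are limited to 250 bytes and many image URLs are longer than that.

```python
import hashlib
from typing import Callable, Dict, Optional


def cache_key(url: str) -> str:
    """Derive a fixed-length key from a thumbnail URL (hypothetical helper)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


class ThumbnailCache:
    """Cache-aside lookup: serve from the cache on a hit, fetch and store on a miss."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        store: Optional[Dict[str, bytes]] = None,
    ) -> None:
        self._fetch = fetch  # assumed to perform the slow remote GET
        self._store = store if store is not None else {}  # stand-in for memcached
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        key = cache_key(url)
        cached = self._store.get(key)
        if cached is not None:
            self.hits += 1
            return cached
        # Miss: fetch from the remote image server, then cache for next time.
        self.misses += 1
        data = self._fetch(url)
        self._store[key] = data
        return data
```

With a real memcached client the dict would be replaced by that client's get and set calls; the control flow stays the same. The hit and miss counters are only there to make it easy to see how often the cache actually helps.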

agpar commented 8 years ago

The problem with only caching is that we display a random 0.01% of the thumbnails every time the user loads the front page, so any given thumbnail is rarely requested twice.

We could re-write how it chooses manifests to display on the front page so it only updates them nightly or weekly, thus making a cache system useful.

I like being able to refresh the page and see a whole new set of interesting stuff though.

ahankinson commented 8 years ago

Good point, but I think choosing a solution to this should be independent of the front-end features.

I'm only saying this because I'm finding that when I present the front page it's a bit strange to explain the stuff that is there. They're not really "Selected items" since we didn't select them, and saying "Random items" just seems a bit too indeterminate. So I'm wondering if that space would be better used for something else (browse by provider? Not sure...). That's a separate issue, but since @jeromepl wants a caching layer as well, it might make sense to provide a general solution.

agpar commented 8 years ago

I'm also thinking of search result thumbnails... We only want to show, like, 30 KB of image, but they are often extremely slow to load, since we are typically asking for a dozen of them all at the exact same time from the same server.

I agree, explaining why the front page is half empty seriously sucks.

I think writing a system to retrieve, convert, save, and make thumbnails available would be a very robust long-term solution. We could do whatever we wanted with thumbnails if it were in place.

Caching would be helpful to this solution, but would probably not be sufficient alone. If every thumbnail is 100 KB, then we already have 7 GB or so that need to be cached. It's more realistic to keep these on a hard drive than to have a caching system that will need to be warmed up every time we update (could take hours, given that each thumbnail is a GET).
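A back-of-the-envelope check of those numbers, as a sketch: roughly 7 GB at 100 KB per thumbnail implies about 70,000 thumbnails. The per-request time and the number of parallel requests below are assumptions chosen for illustration, not measurements, and decimal units (1 KB = 1000 bytes) are assumed.

```python
KB = 1000          # decimal units assumed
GB = 1000 ** 3


def thumbnail_count(total_bytes: int, bytes_per_thumbnail: int) -> int:
    """Number of thumbnails implied by a total size and a per-thumbnail size."""
    return total_bytes // bytes_per_thumbnail


def warmup_hours(count: int, seconds_per_get: float, parallel_requests: int = 1) -> float:
    """Time to re-fetch every thumbnail, one GET each, in hours."""
    return count * seconds_per_get / parallel_requests / 3600


count = thumbnail_count(7 * GB, 100 * KB)

# seconds_per_get values are hypothetical; 5 s is the low end of the
# load times reported at the top of this thread.
for seconds in (1.0, 5.0):
    print(f"{count} thumbnails at {seconds:.0f} s each: "
          f"{warmup_hours(count, seconds):.1f} h sequential, "
          f"{warmup_hours(count, seconds, parallel_requests=10):.1f} h with 10 in parallel")
```

Even under the optimistic one-second assumption, a sequential warm-up runs to many hours, which is consistent with the estimate above.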

agpar commented 8 years ago

BTW: we do a similar thing in musiclibs with regard to thumbnails. If one is not provided, or if we know that a library uses large images as thumbnails, we just grab a random one from the default sequence. It's still very slow.
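A rough sketch of that fallback, not the actual musiclibs code. It assumes a IIIF Presentation 2.x manifest already parsed into a dict, treats the first sequence as the default one, and builds an Image API 2.x style URL from the image service. The function name `pick_thumbnail` and the `prefer_declared` and `width` parameters are hypothetical.

```python
import random
from typing import Any, Dict, Optional


def pick_thumbnail(
    manifest: Dict[str, Any],
    prefer_declared: bool = True,
    width: int = 200,
    rng: Any = random,
) -> Optional[str]:
    """Return a thumbnail URL for a manifest, or None if nothing usable is found."""
    declared = manifest.get("thumbnail")
    if prefer_declared and declared:
        # The thumbnail may be a bare URL or an object with an "@id".
        return declared["@id"] if isinstance(declared, dict) else declared

    # Fall back to a random canvas from the first (default) sequence.
    sequences = manifest.get("sequences") or []
    canvases = sequences[0].get("canvases", []) if sequences else []
    if not canvases:
        return None
    canvas = rng.choice(canvases)
    images = canvas.get("images") or []
    if not images:
        return None

    resource = images[0].get("resource", {})
    service = resource.get("service")
    if isinstance(service, dict) and service.get("@id"):
        # Ask the image service for a scaled-down version (Image API 2.x URL form assumed).
        return f"{service['@id'].rstrip('/')}/full/{width},/0/default.jpg"
    # No image service: the full-size image is the only option.
    return resource.get("@id")
```

Passing `prefer_declared=False` covers the case where a library is known to declare oversized thumbnails. Note that this only chooses the URL; the request itself still goes to the remote image server, so it does nothing for load time on its own.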

ahankinson commented 8 years ago

The problem is that the 'magic' of the site is that we're not serving other people's content, though. Plus it might get us into a bit of a sticky situation WRT copyright if we're storing and re-hosting content.

agpar commented 8 years ago

Yeah, I want the magic to be there too. I'd really like to make the site browsing experience better though. I wonder if it's something that will just resolve itself as IIIF image service implementations improve?