uoregon-libraries / rais-image-server

RAIS: A IIIF-compliant, 100% open source image server for blazing-fast deep zooming
Creative Commons Zero v1.0 Universal

Use JP2 streaming functions instead of file-opening functions #19

Closed by jechols 4 years ago

jechols commented 5 years ago

OpenJPEG has API functions for reading from a stream rather than opening a file on disk. Implementing this may not be trivial, but it shouldn't be too bad, and it could open us up to some performance improvements: for example, caching a small selection of the most recently used JP2s in memory, or reading from S3 directly instead of copying the file.
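To make the in-memory caching idea concrete, here is a minimal sketch of a least-recently-used cache keyed by image id. Everything in it (`JP2Cache`, `NewJP2Cache`, the entry limit) is hypothetical and not part of RAIS; it is also not safe for concurrent use, so a real server would need to add a mutex and probably bound the cache by bytes rather than by entry count.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached JP2: its id and its raw bytes.
type entry struct {
	id   string
	data []byte
}

// JP2Cache is a hypothetical LRU cache holding at most max JP2s in RAM.
type JP2Cache struct {
	max   int
	order *list.List // front = most recently used
	items map[string]*list.Element
}

// NewJP2Cache returns an empty cache that keeps up to max entries.
func NewJP2Cache(max int) *JP2Cache {
	return &JP2Cache{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached bytes for id and marks the entry as recently used.
func (c *JP2Cache) Get(id string) ([]byte, bool) {
	el, ok := c.items[id]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put stores data under id, evicting the least recently used entry if full.
func (c *JP2Cache) Put(id string, data []byte) {
	if el, ok := c.items[id]; ok {
		el.Value.(*entry).data = data
		c.order.MoveToFront(el)
		return
	}
	c.items[id] = c.order.PushFront(&entry{id: id, data: data})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).id)
	}
}

func main() {
	cache := NewJP2Cache(2)
	cache.Put("a.jp2", []byte("aaa"))
	cache.Put("b.jp2", []byte("bbb"))
	cache.Get("a.jp2")                // touch a.jp2 so b.jp2 becomes the oldest
	cache.Put("c.jp2", []byte("ccc")) // evicts b.jp2
	_, ok := cache.Get("b.jp2")
	fmt.Println("b.jp2 still cached:", ok)
}
```

The cached bytes could then be wrapped in a `bytes.Reader` and handed to a stream-based decoder, which is what makes the streaming API a prerequisite for this kind of cache.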

It would be valuable to do performance testing against S3-streamed vs. in-memory vs. on-disk JP2s if we implement this. If there's not a decent gain, that would be unfortunate, but good to know. If there is, it would be good to rebuild the S3 plugin to stream, and to optionally cache JP2s in RAM for small exhibits that need really fast tiles.

jechols commented 5 years ago

This will take a lot more time than expected. A quick prototype proved that the streaming itself is easily done and could dramatically simplify the S3 plugin, but wiring it into the rest of RAIS won't be quick.

The filesystem is assumed to be the source of all images and info.json overrides. It's also used directly in the DZI handlers, which don't just call the appropriate IIIF handlers. Since DZI support was an experimental shim just to see if it was doable, I guess this shouldn't surprise me.

I think the code that registers decoders is probably where we'll want to fix this. Instead of just registering decoding functions by file extension, we should also have image readers or something that are registered and processed in order. We'd attempt to read from the filesystem if none of the other readers (plugins?) matched on the id. This would, for instance, let the S3 plugin register its reader and just respond when the IIIF id starts with "s3:". The reader would know how to read the image resource and return its info.json response, something like this:

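// Streamer is a seekable source of image data that must be released when done.
type Streamer interface {
    io.ReadSeeker
    Free() error
}

// ImageReader knows how to stream one image resource and describe it.
type ImageReader interface {
    Stream() Streamer
    GetInfo() *iiif.Info
}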

Decoders would need to change as well. The IIIFImageDecoder interface currently lives in the main package, which would be awful to use from plugins; the iiif package probably makes more sense. Decoder registration would also have to change so that the registered function takes a Streamer instead of a filename.

Some of the streaming work I've done is in the feature/streaming-jp2s branch. It's messy and broken, but it has the original prototype work that solved one of the tougher bits of this (interacting with OpenJPEG's C streaming APIs).

The above info will undoubtedly prove not to be entirely correct if/when we implement it, but I hope it at least gives us a bit of direction.