image resolving delegation

dbmdz / iiif-server-hymir

Hymir is a Java based IIIF Server. It is based on "IIIF Image API Java Libraries" and "IIIF Presentation API Java Libraries" projects (see https://github.com/dbmdz)

MIT License


image resolving delegation #7

Closed eroux closed 6 years ago

eroux commented 6 years ago

We would like to use Hymir for our new production system. All our images are on S3 (a few TB), so we need to customize the way Hymir gets the image data based on the identifier of the image (and on the HTTP headers, for authentication). Ideally we would write this mechanism in Java. One possible approach would be an abstract class in Hymir defining a few methods, which we could subclass (in a separate .jar), together with a way to configure Hymir so that it uses our class for image resolving and auth.

I don't think hymir currently allows this mechanism, but we would be ready to implement it and propose some pull requests. Would that be reasonable? wdyt?
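The abstract-class idea described in this comment could look roughly like the sketch below. It is only an illustration: `ImageResolver`, `InMemoryResolver`, `ResolverDemo` and the `load` helper are hypothetical names that do not exist in Hymir, and the in-memory subclass merely stands in for a real S3-backed one shipped in a separate .jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical extension point; none of these names exist in Hymir.
abstract class ImageResolver {
  /** Returns true if the request headers grant access to the image. */
  abstract boolean isAuthorized(String identifier, Map<String, String> headers);

  /** Opens the raw image bytes for an image identifier. */
  abstract InputStream open(String identifier) throws IOException;

  /** Instantiates the resolver whose class name is given in the server configuration. */
  static ImageResolver load(String className) throws ReflectiveOperationException {
    Class<? extends ImageResolver> type =
        Class.forName(className).asSubclass(ImageResolver.class);
    return type.getDeclaredConstructor().newInstance();
  }
}

// Stand-in for a resolver shipped in a separate .jar; a real one would call S3.
class InMemoryResolver extends ImageResolver {
  @Override
  boolean isAuthorized(String identifier, Map<String, String> headers) {
    // Assumption for the demo: any non-empty Authorization header is accepted.
    String auth = headers.get("Authorization");
    return auth != null && !auth.isEmpty();
  }

  @Override
  InputStream open(String identifier) {
    byte[] fake = ("bytes-of-" + identifier).getBytes(StandardCharsets.UTF_8);
    return new ByteArrayInputStream(fake);
  }
}

class ResolverDemo {
  public static void main(String[] args) throws Exception {
    // In a real deployment this string would be read from a config property.
    String configured = InMemoryResolver.class.getName();
    ImageResolver resolver = ImageResolver.load(configured);
    if (resolver.isAuthorized("page-001", Map.of("Authorization", "Bearer demo"))) {
      try (InputStream in = resolver.open("page-001")) {
        System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
      }
    }
  }
}
```

Loading by class name keeps the server free of any compile-time dependency on the custom .jar; the trade-off is that a misspelled class name only fails at startup.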

jbaiter commented 6 years ago

Yes, this would absolutely be reasonable. Currently the resolving mechanism is dispersed across several packages and assumes that every identifier resolves to a URI from which Spring's ResourceLoader can retrieve the data. If you find a good way to make this more versatile for cases like yours, we'd very much welcome a pull request!

A good approach would probably be an image.backend.api.ImageRepository API for retrieving the source images, with a default implementation that uses our ResourceService.
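As a rough sketch of that suggestion: only the name `ImageRepository` comes from the comment above. The method signature, the `UriResolver` stand-in for the existing identifier-to-URI resolution, and `DefaultImageRepository` are assumptions made for illustration and are not Hymir code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Name taken from the suggestion above; the method signature is an assumption.
interface ImageRepository {
  InputStream getImageStream(String identifier) throws IOException;
}

// Hypothetical stand-in for the existing identifier-to-URI resolution.
interface UriResolver {
  URI resolve(String identifier);
}

// Default behaviour: resolve the identifier to a URI and open that URI.
class DefaultImageRepository implements ImageRepository {
  private final UriResolver resolver;

  DefaultImageRepository(UriResolver resolver) {
    this.resolver = resolver;
  }

  @Override
  public InputStream getImageStream(String identifier) throws IOException {
    // Works for any URI scheme the JVM has a URL handler for, e.g. file: or http:.
    return resolver.resolve(identifier).toURL().openStream();
  }
}

class ImageRepositoryDemo {
  public static void main(String[] args) throws IOException {
    // A temporary file plays the role of a stored image.
    Path image = Files.createTempFile("image", ".bin");
    Files.write(image, new byte[] {1, 2, 3});

    ImageRepository repository = new DefaultImageRepository(id -> image.toUri());
    try (InputStream in = repository.getImageStream("any-identifier")) {
      System.out.println(in.readAllBytes().length + " bytes read");
    }
    Files.delete(image);
  }
}
```

An S3-backed deployment would then supply its own `ImageRepository` implementation instead of the URI-based default.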

datazuul commented 6 years ago

This is how images are currently read (see the method getInputStream): https://github.com/dbmdz/digitalcollections-core/blob/master/dc-core-backend/dc-core-backend-file/src/main/java/de/digitalcollections/core/backend/impl/file/repository/resource/ResourceRepositoryImpl.java#L116

It is in our DigitalCollections Core library.

jbaiter commented 6 years ago

@datazuul is right of course, you can just supply your own implementation (that e.g. resolves S3 URIs) of ResourceRepository and exclude our default implementation. If you have any questions or find that the API is too cumbersome, ping us and we'll work something out :-)

MarcAgate commented 6 years ago

Thanks! We are going to dive into all that!

datazuul commented 6 years ago

Is this how to get an InputStream from Amazon S3? https://docs.aws.amazon.com/AmazonS3/latest/dev/RetrievingObjectUsingJava.html

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
// try-with-resources closes the stream even if processing throws.
try (InputStream objectData = object.getObjectContent()) {
    // Process the objectData stream.
}

MarcAgate commented 6 years ago

@jbaiter I don't understand clearly what you mean by "exclude our default implementation", since I think we want to provide some way in hymir to pick up the implementation we want to use from the global config of the server (delegating the job to the right service).

MarcAgate commented 6 years ago

@datazuul Yes, at a very basic level, that's how we get objects from S3 (after validating both the user profile and the objects' access rights).
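The "validate first, then fetch" order mentioned here can be sketched as a small wrapper. Everything in it is hypothetical: `ObjectStore` stands in for the S3 client, `AccessPolicy` for whatever checks the user profile and object rights, and the in-memory map replaces a real bucket.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.Set;

// Hypothetical abstraction over the S3 client; a real one would wrap getObject().
interface ObjectStore {
  InputStream open(String key);
}

// Hypothetical policy deciding which user may read which object key.
interface AccessPolicy {
  boolean mayRead(String user, String key);
}

// Checks the policy first, so no object is fetched for an unauthorized user.
class GuardedObjectStore {
  private final ObjectStore store;
  private final AccessPolicy policy;

  GuardedObjectStore(ObjectStore store, AccessPolicy policy) {
    this.store = store;
    this.policy = policy;
  }

  InputStream open(String user, String key) {
    if (!policy.mayRead(user, key)) {
      throw new SecurityException(user + " may not read " + key);
    }
    return store.open(key);
  }
}

class GuardDemo {
  public static void main(String[] args) throws Exception {
    // In-memory data standing in for a bucket and an access list.
    Map<String, byte[]> objects = Map.of("scans/0001.jpg", new byte[] {1, 2, 3});
    Set<String> readableKeys = Set.of("scans/0001.jpg");

    GuardedObjectStore guarded = new GuardedObjectStore(
        key -> new ByteArrayInputStream(objects.get(key)),
        (user, key) -> readableKeys.contains(key));

    try (InputStream in = guarded.open("alice", "scans/0001.jpg")) {
      System.out.println(in.readAllBytes().length + " bytes");
    }
  }
}
```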

jbaiter commented 6 years ago

> @jbaiter I don't understand clearly what you mean by "exclude our default implementation", since I think we want to provide some way in hymir to pick up the implementation we want to use from the global config of the server (delegating the job to the right service).

By "excluding" I meant that you exclude our default implementation from Spring's component scan and instead make sure that your own ResourceRepository implementation is the only one available in the context.

datazuul commented 6 years ago

I created a ready-to-run gist for a customized version of Hymir. Start implementing the Amazon S3 parts in S3ResourceRepositoryImpl.java (see our file-based implementation as a reference: https://github.com/dbmdz/digitalcollections-core/blob/master/dc-core-backend/dc-core-backend-file/src/main/java/de/digitalcollections/core/backend/impl/file/repository/resource/ResourceRepositoryImpl.java).

It would be great if you published this custom version on GitHub as open source, so that we can give feedback if you run into problems and later backport your implementation into Hymir.

Just create a new Maven project containing only the files of this gist: https://gist.github.com/datazuul/96820643e6f1fca1ab2699eff144bbb5

The standard Hymir is pulled in as a dependency, so your customized, ready-to-run Hymir really consists of just the handful of files in the gist.

eroux commented 6 years ago

I think this can be closed: we have an instance of Hymir deployed that fetches images from S3!