loris-imageserver / loris

Loris IIIF Image Server

SimpleHTTPResolver to public S3 bucket #537

Open paroar opened 3 years ago

paroar commented 3 years ago

Hi,

I'm trying to set up a SimpleHTTPResolver, running in Docker with the loris-grok-docker image, against a public S3 bucket that I can access from the browser, without luck. Here's what I tried (one at a time):

SimpleFSResolver

[resolver]
impl='loris.resolver.SimpleFSResolver'
src_img_root='/usr/local/share/images'

Then fetching works fine and I get the image:

http://localhost:5004/image-identifier.jpg/full/full/0/default.jpg

SimpleHTTPResolver

[resolver]
impl = 'loris.resolver.SimpleHTTPResolver'
source_prefix='https://my-bucket.s3.eu-west-3.amazonaws.com/'
cache_root='/usr/local/share/images/loris'

Then fetching returns a 404 error:

http://localhost:5004/image-identifier.jpg/full/full/0/default.jpg

[resolver]
impl = 'loris.resolver.SimpleHTTPResolver'
uri_resolvable=True
cache_root='/usr/local/share/images/loris'

Then fetching returns a 404 error:

http://localhost:5004/https://my-bucket.s3.eu-west-3.amazonaws.com/image-identifier.jpg/full/full/0/default.jpg
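One thing worth checking in the uri_resolvable request above: the IIIF Image API expects special characters in the identifier, slashes included, to be percent-encoded, otherwise the server cannot tell where the identifier ends and the region/size/rotation segments begin. The following is a minimal sketch of building such a URL; `iiif_image_url` is a hypothetical helper written for illustration, not part of Loris, and whether encoding alone resolves the 404 on Loris 2.0.1 has not been confirmed in this thread.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF image request URL with a percent-encoded identifier.

    Hypothetical helper for illustration only; not part of Loris.
    """
    # safe="" makes quote() encode "/" and ":" as well, so the whole
    # identifier stays inside a single path segment.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


url = iiif_image_url(
    "http://localhost:5004",
    "https://my-bucket.s3.eu-west-3.amazonaws.com/image-identifier.jpg",
)
print(url)
```

The printed URL can be pasted into a browser or passed to curl in place of the unencoded one.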

Maybe I'm misunderstanding how it works and would appreciate any help.

Thanks!

bcail commented 3 years ago

What version of Loris are you using? Is there any output in the Loris log?

paroar commented 3 years ago

The Docker image currently uses an old Loris version, 2.0.1.

With source_prefix:

2021-05-20 15:07:21,197 (werkzeug) [INFO]:  * Running on http://0.0.0.0:5004/ (Press CTRL+C to quit)
2021-05-20 15:07:21,198 (werkzeug) [INFO]:  * Restarting with stat
2021-05-20 15:10:04,727 (werkzeug) [INFO]: 172.30.0.1 - - [20/May/2021 15:10:04] "GET /image-identifier.jpg/full/full/0/default.jpg HTTP/1.1" 404 -

With uri_resolvable:

2021-05-20 15:12:52,114 (werkzeug) [INFO]:  * Running on http://0.0.0.0:5004/ (Press CTRL+C to quit)
2021-05-20 15:12:52,115 (werkzeug) [INFO]:  * Restarting with stat
2021-05-20 15:15:57,447 (werkzeug) [INFO]: 172.31.0.1 - - [20/May/2021 15:15:57] "GET /https://my-bucket.s3.eu-west-3.amazonaws.com/image-identifier.jpg/full/full/0/default.jpg HTTP/1.1" 404 -

bcail commented 3 years ago

OK, please try it with the latest version (v3.2.1) and see what output you get. We are using Loris with the SimpleHTTPResolver (although not with S3), configured with a source prefix and a source suffix.
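To make the prefix/suffix idea concrete, here is a rough sketch of how an upstream URL could be derived from an identifier. It assumes the resolver simply concatenates `source_prefix`, the unquoted identifier, and `source_suffix`, and uses the identifier as-is when `uri_resolvable` is on and it already looks like a URL. The function `upstream_url` is hypothetical and is not the Loris implementation; it is only meant for working out by hand which URL to test in a browser.

```python
from urllib.parse import unquote


def upstream_url(ident, source_prefix="", source_suffix="", uri_resolvable=False):
    """Guess the URL an HTTP resolver would fetch for an identifier.

    Assumption (not taken from the Loris source): the result is
    prefix + unquoted identifier + suffix, unless the identifier is
    itself a URL and uri_resolvable is enabled.
    """
    ident = unquote(ident)
    if uri_resolvable and ident.startswith(("http://", "https://")):
        return ident
    return f"{source_prefix}{ident}{source_suffix}"


print(upstream_url("image-identifier.jpg",
                   source_prefix="https://my-bucket.s3.eu-west-3.amazonaws.com/"))
```

If the printed URL does not load in a browser, the 404 is likely coming from the bucket rather than from Loris.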

ambs commented 3 years ago

Just to let you know, I have a similar configuration, and the correct file is being retrieved from my S3 server (a local MinIO server). Meanwhile, I have a problem and a question:

bcail commented 3 years ago

@ambs so Loris is returning an empty string? Is there any output in the loris log?

For S3 authentication, does https://github.com/loris-imageserver/loris/pull/257 help at all?

ambs commented 3 years ago

Meanwhile, I noticed I was using a non-standard webapp.py file. I am now using the proper one from this repository. But although I have a SimpleHTTPResolver in the config file, the logs still say:

2021-06-07 15:05:37,284 (root) [DEBUG]: resolver.impl=loris.resolver.SimpleFSResolver
2021-06-07 15:05:37,284 (root) [DEBUG]: resolver.source_prefix=https://fs.africamediaonline.com/

Curious, as that source prefix is on the line right after the resolver impl option :-O

bcail commented 3 years ago

Do you have multiple resolvers listed in your config? I would suggest commenting out all lines in your config that you're not using.
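A quick way to check is to list every uncommented setting in the `[resolver]` section and see whether any key appears more than once. The sketch below is a plain-text scan written for illustration: it does not use the parser Loris itself uses, the helper names are hypothetical, and it ignores trailing inline comments. The sample config is made up.

```python
def active_resolver_settings(config_text):
    """Return (line_number, key, value) for each uncommented [resolver] key."""
    settings = []
    in_resolver = False
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out settings
        if line.startswith("["):
            # Track whether we are inside the [resolver] section.
            in_resolver = line.strip("[]").strip() == "resolver"
            continue
        if in_resolver and "=" in line:
            key, _, value = line.partition("=")
            settings.append((number, key.strip(), value.strip().strip("'\"")))
    return settings


def duplicated_keys(settings):
    """Return the sorted keys that occur more than once."""
    seen, dupes = set(), set()
    for _, key, _ in settings:
        (dupes if key in seen else seen).add(key)
    return sorted(dupes)


# Made-up sample with two active impl lines.
sample = """
[resolver]
impl = 'loris.resolver.SimpleFSResolver'
src_img_root = '/usr/local/share/images'
impl = 'loris.resolver.SimpleHTTPResolver'
source_prefix = 'https://example.org/'
#uri_resolvable = True
"""

found = active_resolver_settings(sample)
for number, key, value in found:
    print(f"line {number}: {key} = {value}")
print("duplicated keys:", duplicated_keys(found))
```

Any key reported as duplicated points at a leftover line that should be commented out.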

ambs commented 3 years ago

@bcail I think so: https://paste.debian.net/1200300/

bcail commented 3 years ago

Try restarting everything, and see if there's any error in the log when you try the HTTP resolver.

ambs commented 3 years ago

Full startup log here: https://paste.debian.net/1200306/ Then, when accessing it, it looks like it is searching locally:

loris_loris.1.w09qtluc3han@s3.africamedaionline.com    | 2021-06-07 17:08:29,188 (loris.resolver) [WARNING]: Source image not found for identifier: 1103%2F85%2F85_273.jpg.
loris_loris.1.w09qtluc3han@s3.africamedaionline.com    | /opt/loris/loris/webapp.py:202: DeprecationWarning: 'BaseResponse' is deprecated and will be removed in Werkzeug 2.1. 'Response' now includes the functionality directly.
loris_loris.1.w09qtluc3han@s3.africamedaionline.com    |   super(LorisResponse, self).__init__(response=response, status=status, content_type=content_type)
loris_loris.1.w09qtluc3han@s3.africamedaionline.com    | 2021-06-07 17:08:29,194 (werkzeug) [INFO]: 10.0.7.245 - - [07/Jun/2021 17:08:29] "GET /1103%2F85%2F85_273.jpg/info.json HTTP/1.1" 404 -

The curious thing is that with the web app from https://github.com/bodleian/loris-grok-docker, after changing the code from Python 2 to Python 3, I managed to get it "working" (kind of!) :-)

bcail commented 3 years ago

That's good that you got it working.

ambs commented 3 years ago

I did not explain myself clearly. I had it working before. But given the code that was running was 5 years old, I decided to use the webapp from this repository. And it's with this new version that I am getting these problems on SimpleHTTPResolver.

I will keep debugging it and let you know my findings.

ambs commented 3 years ago

This patch is working for me: https://github.com/loris-imageserver/loris/pull/538 @paroar see if it helps you too.

Will look into authentication next.