cantaloupe-project / cantaloupe

High-performance dynamic image server in Java
https://cantaloupe-project.github.io/

Cantaloupe always appends its prefix when constructing URIs #595

Open markpatton opened 2 years ago

markpatton commented 2 years ago

This seems to make it impossible for us to use our own prefix.

Consider the apache reverse proxy configuration of cantaloupe-test.library.jhu.edu:

        RequestHeader set X-Forwarded-Proto HTTPS
        RequestHeader set X-Forwarded-Port 443
        RequestHeader set X-Forwarded-Path /iiif/

        ProxyPass /iiif/ http://localhost:8182/iiif/2/ nocanon
        ProxyPassReverse /iiif/ http://localhost:8182/iiif/2/

A request to https://cantaloupe-test.library.jhu.edu/iiif/rose%2fDouce195%2fDouce195.001r.jp2/info.json returns:

    "@id": "https://cantaloupe-test.library.jhu.edu/iiif/iiif/2/rose%2fDouce195%2fDouce195.001r.jp2",

As you can see, Cantaloupe appends its prefix /iiif/2/ to the prefix we want, /iiif/. This seems like a bug to me. The desired prefix was set with X-Forwarded-Path. That value should replace the existing Cantaloupe prefix, not have the Cantaloupe prefix appended to it.
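To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is not Cantaloupe's code; the function names and the `join_path` helper are hypothetical and only model the URI construction described above, using the values from this report.

```python
def join_path(*parts: str) -> str:
    """Join path segments with single slashes, skipping empty segments."""
    segments = [p.strip("/") for p in parts if p.strip("/")]
    return "/" + "/".join(segments)


def id_with_appended_prefix(origin, forwarded_path, server_prefix, identifier):
    """Reported behavior: the server prefix is added after X-Forwarded-Path."""
    return origin + join_path(forwarded_path, server_prefix, identifier)


def id_with_replaced_prefix(origin, forwarded_path, server_prefix, identifier):
    """Requested behavior: X-Forwarded-Path replaces the server prefix."""
    # server_prefix is deliberately unused here; it is kept only so both
    # functions take the same arguments.
    return origin + join_path(forwarded_path, identifier)


ORIGIN = "https://cantaloupe-test.library.jhu.edu"
IDENTIFIER = "rose%2fDouce195%2fDouce195.001r.jp2"

observed = id_with_appended_prefix(ORIGIN, "/iiif/", "/iiif/2/", IDENTIFIER)
desired = id_with_replaced_prefix(ORIGIN, "/iiif/", "/iiif/2/", IDENTIFIER)
print(observed)
print(desired)
```

The first value matches the "@id" shown above; the second is the "@id" we would like to get back.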

If this is in fact the intended behavior, is there any way I can control how Cantaloupe constructs URIs such that /iiif/ is used instead of /iiif/2/?

regisrob commented 2 years ago

We have the same need, and it turns out to be a blocker for switching to Cantaloupe (we'd prefer to keep our image URLs as they are and not have to redirect them from /iiif/ to /iiif/2/).

ewg118 commented 7 months ago

Has this been resolved? I am trying to migrate from Loris to Cantaloupe because upgrading to Ubuntu 22.04 broke Loris, but I am stuck because the info.json URI doesn't reflect the Apache config.

glenrobson commented 7 months ago

Can I ask a couple of clarification questions:

ewg118 commented 7 months ago

It's a double-edged sword because the image URIs are harvested and reused by external partners, so putting the IIIF API version in the URL pattern doesn't conform to the tenet of clean URLs. I can't just migrate from Loris to Cantaloupe seamlessly by updating some lines in the Apache site configuration without breaking the reusability of the URIs downstream. If I can't replicate the URIs exactly from one platform to another, one expedient solution might be to create an additional forward from https://mysite/images/* to https://mysite/iiif/2/, if that makes sense.

markpatton commented 7 months ago

@glenrobson

The issue is that we want to control precisely what the URLs are. This is important in particular when migrating between different image servers. The existing URL structure must be maintained.

As an ugly workaround, I have done something like the below:

        # Fix incorrectly constructed urls
        AddOutputFilterByType SUBSTITUTE application/json
        Substitute "s|/iiif/iiif/2/|/iiif/|n"

        # Fix location header
        Header edit Location /iiif/iiif/2/ /iiif/
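
For readers less familiar with these directives, the following Python sketch models what the two rules do to a response. It is only an illustration of the rewrite, not a replacement for the Apache config; the function names are hypothetical and the Location value is a made-up example.

```python
import json

WRONG_PREFIX = "/iiif/iiif/2/"
WANTED_PREFIX = "/iiif/"


def fix_body(text: str) -> str:
    """Replace every occurrence as a fixed string, like Substitute with the n flag."""
    return text.replace(WRONG_PREFIX, WANTED_PREFIX)


def fix_location(value: str) -> str:
    """Replace only the first match, like Header edit (as opposed to edit*)."""
    return value.replace(WRONG_PREFIX, WANTED_PREFIX, 1)


# Body as returned by Cantaloupe behind the proxy from the original post.
body = json.dumps({
    "@id": "https://cantaloupe-test.library.jhu.edu/iiif/iiif/2/rose%2fDouce195%2fDouce195.001r.jp2"
})
fixed = json.loads(fix_body(body))

# Illustrative Location header value (an assumption, not taken from a real response).
location = fix_location(
    "https://cantaloupe-test.library.jhu.edu/iiif/iiif/2/rose%2fDouce195%2fDouce195.001r.jp2/info.json"
)
print(fixed["@id"])
print(location)
```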

glenrobson commented 7 months ago

additional forward from https://mysite/images/* to https://mysite/iiif/2/ if I can't replicate the URIs exactly from one platform to another, if that makes sense.

Yes, I think a forward would work, and it would allow you to upgrade to v3 in the future by updating the forward. One thing to watch out for is that you need to set CORS on the forward.
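
As a rough sketch of that idea, assuming "forward" means an HTTP redirect: the Python function below maps a legacy path to the versioned one and attaches a CORS header to the redirect response itself. The /images/ prefix comes from the comment above; the function name, the 302 status, and the wildcard origin are assumptions for illustration only.

```python
LEGACY_PREFIX = "/images/"   # legacy prefix from the comment above
TARGET_PREFIX = "/iiif/2/"   # Cantaloupe's versioned endpoint


def forward(path: str):
    """Return (status, headers) for a legacy path, or None if it does not match."""
    if not path.startswith(LEGACY_PREFIX):
        return None
    headers = {
        # Keep the rest of the path untouched, including encoded slashes.
        "Location": TARGET_PREFIX + path[len(LEGACY_PREFIX):],
        # The redirect response needs its own CORS header; otherwise
        # browser-based viewers on other origins may not be able to follow it.
        "Access-Control-Allow-Origin": "*",
    }
    return 302, headers


status, headers = forward("/images/rose%2fDouce195%2fDouce195.001r.jp2/info.json")
print(status, headers["Location"])
```

Moving to v3 later would then only mean changing TARGET_PREFIX in the forward.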

@markpatton would a forward work for you too or do you require the URL to be exactly the same? Have you thought about how you are going to handle multiple versions?

DiegoPino commented 7 months ago

Hi. My 2 cents here. I believe Apache might not be the easiest tool for this (it can be done and will work, but URL argument extraction and proxying/header manipulation are more complex in Apache). We use NGINX in front of all our services, and Cantaloupe is set up as an upstream there. That allows us to manipulate all the headers/request URLs/arguments and also serve, if needed, multiple aliases/rewrites (native Cantaloupe URLs + any rewrite from your originals, e.g. your loris or ) to Cantaloupe's native needs. Maybe you could give that a try?

markpatton commented 7 months ago

@glenrobson

Putting a version number in the URL is a choice, as is supporting multiple versions. A IIIF image server should be neutral with respect to the URL prefix. Ideally, if the server supports multiple versions, that support should be configurable: whether each version is on or off, and what its URL prefix is.

Setting up a reverse proxy as in the original post does work, but is ugly. Also note you need to hack the info.json return and Location headers.

To answer another question, cantaloupe does not respect X-Forwarded-Path correctly. See the original post.