IIIF / api

Source for API and model specifications documents (api and model)
http://iiif.io/api

Define a probe service #1290

Closed by azaroth42 1 year ago

azaroth42 commented 7 years ago

Split from #547.

Issue:

To allow the IIIF Authentication workflow to work with resources that are not IIIF Image API resources, such as audio or video, there needs to be a way to test whether the browser/client has the right credentials to access the content. For the Image API, the info.json document can fulfill this role (as it represents the set of image derivatives available from the service); however, there needs to be an equivalent for other resource types.

Proposal:

Define a new "probe" authentication service. Clients wanting to know whether the user has the credentials to retrieve the content resource SHOULD make a HEAD request on the URI which is the id of the probe service. The HTTP status code in the response MUST be the same status code the browser would receive when interacting with the content resource. The URI of the service MAY be the URI of the content resource. The service MUST support the CORS affordances, including the Access-Control-Allow-Origin and Access-Control-Allow-Headers response headers, and the pre-flight OPTIONS request MUST return 200. As an implementation note, using the content resource as the service, combined with the CORS requirement, might have unexpected consequences.
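A minimal sketch of the client side of this proposal, written in Python for brevity (a browser client would use XHR or fetch instead). The function names and the mapping from status codes to actions are assumptions made for illustration; the proposal itself only requires that the probe returns the status code the content resource would return.

```python
import urllib.request


def build_probe_request(probe_id: str) -> urllib.request.Request:
    """Build (but do not send) the HEAD request for the probe service."""
    return urllib.request.Request(probe_id, method="HEAD")


def interpret_probe_status(status: int) -> str:
    """Map the probe's HTTP status to a client action (hypothetical labels)."""
    if 200 <= status < 300:
        return "display"       # credentials are sufficient
    if status in (401, 403):
        return "authenticate"  # start or repeat the login flow
    # Redirects are normally followed by the HTTP client before the
    # caller sees them, so they are not handled separately here.
    return "error"


# The probe id MAY be the URI of the content resource itself.
request = build_probe_request("https://example.org/video/full.mp4")
```

Keeping the request a HEAD means the client never downloads the body of a large audio or video file just to learn its status code.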

A concern exists about the use of the content resource as a service in the RDF scope, as it means asserting the probe profile on the same URI as the content resource, which would result in:

{
  "id": "content-resource",
  "type": ["Video", "Service"],
  "profile": ["video-profile", "probe-profile"],
  "service": [{
    "profile": "services-service",
    "service": {
        "profile": "login",
        "service": {
            "id": "content-resource"
        }
    }
  }]
}

tomcrane commented 6 years ago

Proof of Concept required.

  1. A common implementation pattern for adaptive bitrate delivery would be to have the MPEG-DASH manifest (or HLS equivalent) act as the probe. Reference @irv's experiments.

  2. Need to define how the degraded flow works. I have an mp3 which has parts redacted, that anyone can play, and another complete mp3, that can only be played in the reading room. Should be OK.

tomcrane commented 6 years ago

Here's a first pass at an approach to this:

https://github.com/digirati-co-uk/iiif-auth-server/blob/master/non-service-auth.md

And running here:

https://digirati-co-uk.github.io/iiif-auth-client/?sources=https://iiifauth.digtest.co.uk/index.json

It all works fine in Firefox, but I'm having a problem in Chrome when you go back to something you've already looked at and the probe service is the content resource itself. If I set an explicit cache-control on the response to the GET for the content (e.g., the video):

Cache-Control: public, max-age=43200

...Chrome doesn't make a new request for a subsequent probe HEAD request to the same resource - although having said that I can't get it to repeat that behaviour!

tomcrane commented 6 years ago

Additional links:

Probe service is content resource (default, assumed behaviour, just as the description resource is assumed to be the probe now): https://iiifauth.digtest.co.uk/manifest/21_av_lego

Probe service is something else: https://iiifauth.digtest.co.uk/manifest/22_av_stars

...avoids ever having to say:

"type": ["Video", "Service"]

tomcrane commented 6 years ago

And an example of auth on a resource linked via rendering:

https://iiifauth.digtest.co.uk/manifest/24_pdf_prezi3

tomcrane commented 6 years ago

Update from thrashing through the use case with BL...

For a content resource, you can't detect the 302 status code. The technique used for image services (comparing the id of the returned resource with the URL you asked for) won't work because there's no JSON body with an id.

One approach (that differs from the flow diagram) is to make the probe service a GET. The probe then returns a data object that can state that the user's interaction with the content resource, given the current credentials represented by the token, would have resulted in a redirected response.

fyi @irv @edsilv

tomcrane commented 6 years ago

The British Library requires the degraded scenario from the start, so this approach is not enough:

https://raw.githubusercontent.com/digirati-co-uk/iiif-auth-server/master/auth-with-probe.png

...need to solve this in a way that:

The basic problem... how to detect that a response is degraded. This feels like we need to return a document in the probe service, which makes it a GET. I was trying to keep it as a HEAD (so the flow is simpler... the client establishes whether the probe service is the CR itself or some other resource, then makes a HEAD request to it regardless).

We want clients to be able to use HEAD, because we don't want to start GET-ting very large AV resources to determine their HTTP status codes.

And having already required a HEAD, it was simpler that it's always a HEAD.

But it seems like we will have to complicate it so that:

...which means we have to have more spec; we need to define the format of the probe service, rather than relying on pure HTTP to tell us what we want to know.

tomcrane commented 6 years ago

"...which means we have to have more spec; we need to define the format of the probe service, rather than relying on pure HTTP to tell us what we want to know."

At time of writing, this manifest:

https://iiifauth.digtest.co.uk/manifest/22_av_stars

...references this probe service:

https://iiifauth.digtest.co.uk/probe/22_av_stars.mp4

The response body of that probe service is currently irrelevant; only the HTTP status code matters. The above comment implies that we need to define what the body of this probe service looks like, and how it conveys access:

{
   ...
   "statusCodeThatYouWouldHaveGotForTheContentResourceIAmTheProbeFor": 302
}

(a straw man, obviously)

tomcrane commented 6 years ago

Alternatives:

The probe service doesn't return a "you would have got this status code" message, it just returns the content location of where the CR request will be redirected to if made with the cookie corresponding to the provided token (if the client has a token).

The client can then compare this URL with the @id of the content resource the probe service is declared for, and predict that a redirect will happen. This feels similar to the info.json pattern. It's not the @id of the probe service though, it's a different URL.

The GET on the probe service should still result in HTTP 401 if that is what the client would get for the CR - it's only for status codes invisible to XHR that we should instead return HTTP 200 and provide additional info in the response.

{
    "contentLocation": "https://example.org/video/degraded.mp4"
}

Client can then compare the contentLocation property with the @id of the content resource.
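A rough sketch of how a client might act on this alternative, in Python for brevity. The function name, its return labels, and the example URLs other than the contentLocation value above are assumptions for illustration; only the contentLocation property comes from the straw man.

```python
from typing import Optional


def predict_access(content_id: str, probe_status: int,
                   probe_body: Optional[dict]) -> str:
    """Predict what a request for the content resource would do,
    based on a GET to its probe service."""
    if probe_status == 401:
        return "authenticate"  # same status the content resource would give
    if probe_status == 200:
        location = (probe_body or {}).get("contentLocation")
        if location is None or location == content_id:
            return "full"      # no redirect expected
        return "redirected"    # e.g. to a degraded version
    return "error"


full_id = "https://example.org/video/full.mp4"  # hypothetical content resource
probe_body = {"contentLocation": "https://example.org/video/degraded.mp4"}
outcome = predict_access(full_id, 200, probe_body)
```

Here `outcome` is "redirected", because the contentLocation differs from the id of the content resource the probe was declared for.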

tomcrane commented 6 years ago

Alternative

We could use CORS Access-Control-Expose-Headers to expose Location or Content-Location, have the client read that, and stay in HTTP only - but then we'd be using this on the probe service when we mean it for the Content Resource, which is messy too.

Plus - can the XHR client see these headers after a redirect? The probe service would have to return these on a 200 response, for which Location has no meaning.
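For context on which headers a script can read at all: browsers only show cross-origin scripts the CORS-safelisted response headers plus any names listed in Access-Control-Expose-Headers (Access-Control-Allow-Headers governs request headers instead). The sketch below models that filter in Python; the function name is hypothetical and the "*" wildcard is deliberately ignored to keep it short.

```python
from typing import Dict

# CORS-safelisted response header names, lowercased.
SAFELISTED = {
    "cache-control", "content-language", "content-length",
    "content-type", "expires", "last-modified", "pragma",
}


def script_visible_headers(response_headers: Dict[str, str]) -> Dict[str, str]:
    """Return the subset of response headers a cross-origin script could read."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    exposed = {
        name.strip().lower()
        for name in lowered.get("access-control-expose-headers", "").split(",")
        if name.strip()
    }
    allowed = SAFELISTED | exposed
    return {name: value for name, value in lowered.items() if name in allowed}


hidden = script_visible_headers({
    "Content-Type": "video/mp4",
    "Content-Location": "https://example.org/video/degraded.mp4",
})
shown = script_visible_headers({
    "Content-Type": "video/mp4",
    "Content-Location": "https://example.org/video/degraded.mp4",
    "Access-Control-Expose-Headers": "Content-Location",
})
```

In the first case Content-Location is filtered out; in the second it is readable because the server exposed it explicitly.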

tomcrane commented 6 years ago

Updated version of the previous diagram, assuming that the probe service, if present, is a GET and conveys (in a manner tbd) a content location for the resource it is a service for:

[Diagram: IIIF auth with probe, client perspective]

tomcrane commented 6 years ago

What does the "Display" box mean in the IIIF Auth spec?

This is important as it decides what the spec needs to concern itself with when applied to more complex content interactions such as adaptive bit rate content, as in https://github.com/UniversalViewer/universalviewer/issues/608#issuecomment-414704326

I think the stance is:

At that point, you have left the IIIF Auth Spec behind. The client's job is now to display that resource. For almost all content, the client leaves its IIIF implementation and passes the URL of the content resource to the browser in some way, for the browser to make a simple request (i.e., not one made by XHR and affected by CORS). This might mean setting the src attribute of an img, video or audio tag. Crucially, this kind of request will include the cookie acquired during the login flow. That is, the IIIF auth spec's responsibility is to get the client to a point where the cookie has been acquired, then let whatever kind of request is appropriate for the content happen, using that cookie.

If the content resource is something fancy, such as an MPEG-DASH manifest, it is the client's responsibility to deal with the extra complexities of sending that cookie. In the IIIF spec we have avoided the need for the client to make a .withCredentials XHR request. If the client has to do that to get what it needs immediately after leaving the IIIF part of the flow, then so be it - the IIIF part of the flow has still done its job as simply as possible.

For adaptive bit rate content in particular, a client such as the UV knows that it's just been through the IIIF auth flow, knows that the content is an MPD or HLS manifest, knows that the content has auth services attached, and therefore concludes that it should force a credentialled XHR request when it gets to the "Display" part, because that's what "Display" means for that kind of special content. It can also decide NOT to make a credentialled request for an mpd/hls manifest when it didn't spot auth services earlier on.
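The decision described above can be sketched as a small predicate. This is Python for brevity and the function name is hypothetical; a real client such as the UV would make this choice inside its own player integration. The media types are the usual ones for MPEG-DASH and HLS manifests.

```python
# Media types that identify adaptive bit rate manifests (lowercased).
ADAPTIVE_FORMATS = {
    "application/dash+xml",           # MPEG-DASH MPD
    "application/vnd.apple.mpegurl",  # HLS
    "application/x-mpegurl",          # HLS (legacy type)
}


def needs_credentialled_request(media_type: str, has_auth_services: bool) -> bool:
    """True when "Display" should mean a credentialled XHR for this resource."""
    return has_auth_services and media_type.lower() in ADAPTIVE_FORMATS


force_credentials = needs_credentialled_request("application/dash+xml", True)
```

Anything else, including an MPD or HLS manifest with no auth services attached, falls through to an ordinary request.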

The important thing is that these details are beyond the IIIF spec. They are good cookbook entries, examples of IIIF use on real world problems, but they don't require that the spec gets into the weeds of third-party library workarounds for the lack of native adaptive bit rate support in browsers (at which point, the mpd/hls manifest request would become a simple http request).

The BL's solution (that is, @irv's implementation) will take that credentialled request for the HLS/MPD manifest and redirect to a session-specific manifest that inserts short lived access tokens into the URLs of all the content parts, so that the fragment requests made by the adaptive bitrate client library are authed by the presence of a short lived, obscure access token in the URL.
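To make the idea concrete, here is a loose sketch of rewriting an HLS media playlist so that every segment URL carries a token. It is not the BL's code: the `token` query parameter name, the function names, and the token value are assumptions, and tags that carry their own URIs (keys, init segments) are ignored.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_token(url: str, token: str) -> str:
    """Append a token query parameter, preserving any existing query."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


def personalise_playlist(playlist: str, token: str) -> str:
    """Return a session-specific playlist with the token on each segment URL."""
    rewritten = []
    for line in playlist.splitlines():
        # In an HLS playlist, non-empty lines that do not start with "#" are URIs.
        if line and not line.startswith("#"):
            line = add_token(line, token)
        rewritten.append(line)
    return "\n".join(rewritten)


playlist = "#EXTM3U\n#EXTINF:6.0,\nseg0.ts\n#EXTINF:6.0,\nseg1.ts?x=1\n#EXT-X-ENDLIST"
session_playlist = personalise_playlist(playlist, "abc123")
```

The player library then fetches segments with plain requests, and the server authorises each one from the token in its URL rather than from a cookie.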

tomcrane commented 3 years ago

(with @irv and @stephenwf)

Ran into a potential issue with the probe service proposal as presented in this manifest - https://iiifauth.digtest.co.uk/manifest/22_av_stars

Ignore the pre-Presentation 3 JSON, the main point is that the probe service sits alongside the token service (and the optional logout service in this case) as members of the service property of the Cookie service. This makes sense, because the probe service, like the token service, needs to live in the context of the credentials acquired from a particular Cookie (login) service.

That's fine... until you want to take advantage of the services property to avoid repetition of the auth info in a manifest that has multiple resources protected by the same Cookie service.

https://iiif.io/api/presentation/3.0/#services

The probe service is a child of this cookie service, but it has to be a different probe service for each resource - the probe service is for the resource itself.

If you just include the probe service on each inline use of the login service, and leave the client to fill in the blanks from the definition in the services block, it works fine if you build your synthesised cookie service from the point of view of that resource, but if you evaluate the whole manifest (e.g., as JSON-LD) you are effectively saying that your Cookie service has ALL these probe services.

In fact, it's still a problem if you don't use the services block (that just makes the problem more obvious). The cookie service is the same each time but its probe service is different, depending on its grandparent content resource. The evaluation of the whole graph results in the one Cookie service having all the probe services.
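A small illustration of that graph problem, in Python with made-up ids. Merging service nodes by id is only a rough stand-in for what evaluating the manifest as JSON-LD would do, and the function name is hypothetical.

```python
from typing import Dict, List


def merge_child_services(resources: List[dict]) -> Dict[str, List[str]]:
    """Union the child services of every service node that shares an id."""
    merged: Dict[str, set] = {}
    for resource in resources:
        for service in resource.get("service", []):
            children = merged.setdefault(service["id"], set())
            children.update(child["id"] for child in service.get("service", []))
    return {sid: sorted(children) for sid, children in merged.items()}


LOGIN = "https://example.org/login"  # hypothetical shared cookie service

resources = [
    {"id": "https://example.org/a.mp4",
     "service": [{"id": LOGIN,
                  "service": [{"id": "https://example.org/probe/a"}]}]},
    {"id": "https://example.org/b.mp4",
     "service": [{"id": LOGIN,
                  "service": [{"id": "https://example.org/probe/b"}]}]},
]

merged = merge_child_services(resources)
```

After the merge, the single login service owns both probes, and nothing in the merged view says which probe belongs to which content resource.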

This doesn't invalidate the idea of the probe service, just the proposal for how it is associated with the resource and the cookie service.

tomcrane commented 2 years ago

Coming back to the above comment:

I think the problem goes away if the probe service is on the resource itself, alongside the cookie service (renamed access service in Auth 2.0 draft), rather than a child service of the cookie service. This allows/requires the same probe service to handle credentials from different cookie services.

This makes sense if you think about the case where a resource is its own probe service, as with an info.json. The info.json response would also have to handle credentials from any child service.
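Sketched in Python with placeholder ids, the restructured shape might look like the following. The type names "ProbeService" and "AccessService" and the helper names are hypothetical stand-ins, not the names used in any published specification.

```python
from typing import Optional

# One shared access (cookie) service with no per-resource children.
ACCESS = {"id": "https://example.org/login", "type": "AccessService"}


def resource_with_probe(resource_id: str, probe_id: str) -> dict:
    """Attach the probe directly to the resource, alongside the access service."""
    return {
        "id": resource_id,
        "service": [{"id": probe_id, "type": "ProbeService"}, ACCESS],
    }


def probe_for(resource: dict) -> Optional[str]:
    """Find the probe service declared on the resource itself."""
    for service in resource.get("service", []):
        if service.get("type") == "ProbeService":
            return service["id"]
    return None


video_a = resource_with_probe("https://example.org/a.mp4", "https://example.org/probe/a")
video_b = resource_with_probe("https://example.org/b.mp4", "https://example.org/probe/b")
```

Each resource keeps its own probe, while the access service is identical wherever it appears, so evaluating the whole graph no longer piles every probe onto one service.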

zimeon commented 1 year ago

Resolved. The IIIF Authorization Flow 2.0.0 specification was published 2023-06-02: https://iiif.io/api/auth/2.0/ -- includes probe service