Open mikewest opened 9 years ago
+@tyoshino
cc @mnot - I'm a bit confused on the context - is this saying that two different URIs with 206 responses should be stitched together just because they both had the same original URI before redirection (and if they pass CORS)? That seems odd - they're different resources.
This scheme is already in use widely by CDNs. Chrome's HTMLMediaElement is stitching fragments served for different URLs together (see the opening comment of https://code.google.com/p/chromium/issues/detail?id=532569 by strobe). Chrome's resource loader in general doesn't.
Given the situation, it seems we could document requirements for such an approach to make sure it's secure. It doesn't necessarily require all "fetching" on the web platform to do the stitching.
Doing it generically would indeed be very broken.
To do this for a specific application (e.g., HTMLMediaElement), you need a really explicit assertion that not only are the two resources equivalent, but also that the two specific representations are exactly the same -- e.g., ETag sharing. Even then, this is not something happening in HTTP -- it has to be built on top.
See: http://httpwg.github.io/specs/rfc7233.html#combining.byte.ranges http://httpwg.github.io/specs/rfc7234.html#combining.responses
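As a rough illustration of the "explicit assertion" described above, here is a minimal sketch of a check an application layer could run before stitching two 206 responses. It is an assumption-laden simplification: `PartialResponse`, `can_combine` and `stitch` are hypothetical names, only strong ETags are considered (RFC 7233 also discusses other validators), and only the simple case where the second part directly follows the first is handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartialResponse:
    """Hypothetical, simplified view of a 206 response."""
    status: int
    etag: Optional[str]  # raw ETag header value, e.g. '"v1"' or 'W/"v1"'
    start: int           # first byte position taken from Content-Range
    body: bytes


def is_strong_etag(etag: Optional[str]) -> bool:
    # Weak validators are prefixed with W/ and do not guarantee identical bytes.
    return etag is not None and not etag.startswith("W/")


def can_combine(a: PartialResponse, b: PartialResponse) -> bool:
    # Both parts must be partial content that share the same strong validator.
    return (
        a.status == 206
        and b.status == 206
        and is_strong_etag(a.etag)
        and a.etag == b.etag
    )


def stitch(a: PartialResponse, b: PartialResponse) -> Optional[bytes]:
    """Return the combined bytes if b directly continues a, else None."""
    if not can_combine(a, b):
        return None
    if b.start != a.start + len(a.body):
        return None
    return a.body + b.body
```

Note that nothing in this check looks at the URL, which matches the point above: the decision to stitch has to come from the representation metadata, not from how the request was routed.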
Are we doing this @rocallahan?
When our media resource loader takes over an HTTP load, it uses the final post-all-redirects URI as its canonical URI for the resource. All subsequent range requests start with that URI; if further redirects occur, they are honoured. The principal(s) associated with the media data are gathered from all final-URIs. If these are different origins that's generally OK: we'll still play the media, though (since at least one of those origins must not be same-origin with the page) certain APIs will be affected (e.g. after drawing a video frame to a canvas, the canvas will be tainted).
I'm not familiar with the CDN setup described in https://code.google.com/p/chromium/issues/detail?id=532569, but I assume the CDN has a canonical URI which redirects quasi-randomly to one of many mirror URIs, and the mirror URIs never do any more redirects. If so, then by using the final URI from the first load for every subsequent range request we're avoiding any issues.
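The model described in the two paragraphs above can be sketched roughly as follows. This is not Firefox code: `MediaLoader`, the `redirects` table and the example URLs used with it are hypothetical, the network is replaced by a dictionary lookup, and the tainting check ignores CORS entirely.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    # Simplified origin: scheme plus host[:port], with no default-port handling.
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def follow_redirects(url: str, redirects: dict, limit: int = 20) -> str:
    """Follow the hypothetical redirect table until a non-redirecting URL."""
    for _ in range(limit):
        if url not in redirects:
            return url
        url = redirects[url]
    raise RuntimeError("too many redirects")


class MediaLoader:
    def __init__(self, initial_url: str, redirects: dict):
        self.redirects = redirects
        # The final post-redirect URL of the first load becomes canonical.
        self.canonical_url = follow_redirects(initial_url, redirects)
        self.origins = {origin_of(self.canonical_url)}

    def range_request(self) -> str:
        # Every later range request starts from the canonical URL;
        # any further redirects are still honoured.
        final_url = follow_redirects(self.canonical_url, self.redirects)
        self.origins.add(origin_of(final_url))
        return final_url

    def taints_canvas(self, page_origin: str) -> bool:
        # Assumption: any origin other than the page's taints the canvas.
        return any(o != page_origin for o in self.origins)
```

With a table such as `{"https://cdn.example/video": "https://mirror1.example/v.mp4"}`, the loader would settle on the mirror URL after the first load and never go back to the CDN entry point.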
Okay, so it sounds like the HTML standard would need to do this for media elements. @foolip, have you looked into doing this? It would perhaps also require some overrides then to make sure Fetch does not do anything bad upstream.
I haven't given this any thought in the spec, no. What I do know is that media elements integrate with the network layer in a rather unique way, that seems to be true of all implementations, and certainly was in Presto.
The problem of knowing that the resource is the same when requesting a second range isn't unique to redirects. Even when the same server responds, you in principle need some sanity checks. I doubt that these are interoperable today, and I doubt even more that doing the strict checks that would actually make sense (ETag) would really be web compatible.
Just to make sure, the proposal by @rocallahan is that once the UA receives any body bytes back from the server, it stops following further redirects?
Seems the model doesn't work for some CDNs. See this post by strobe@ from YouTube https://code.google.com/p/chromium/issues/detail?id=532569#c33
> Just to make sure, the proposal by @rocallahan is that once the UA receives any body bytes back from the server, it stops following further redirects?
Sorry, I thought I was pretty clear and I'm not sure how to make it clearer:
> All subsequent range requests start with that URI; if further redirects occur, they are honoured.
...
> Seems the model doesn't work for some CDNs. See this post by strobe@ from YouTube https://code.google.com/p/chromium/issues/detail?id=532569#c33
That seems to be based on a misunderstanding of what I said.
I wanted to make sure I'm understanding what you said in the second paragraph correctly. It was my mistake to refer to that paragraph as a "proposal".
Thanks for replying to the crbug thread.
https://jewel-chair.glitch.me/same-origin.html
<audio> that points to /audio-redirect-second-part. For a Range that starts at an offset other than 0, the server redirects to /audio-normal.
Chrome: Observes the redirect. Subsequent requests go to /audio-normal.
Firefox: Observes the redirect. Subsequent requests go to /audio-redirect-second-part.
https://jewel-chair.glitch.me/same-origin-immediate-redirect.html
<audio> that points to /audio-redirect-first-part. This immediately redirects to /audio-normal.
Chrome: Observes the redirect. Subsequent requests go to /audio-normal.
Firefox: Observes the redirect. Subsequent requests go to /audio-normal.
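To make the difference between the two test pages above easier to see, here is a small simulation. It is only a sketch: `server_redirect` is a hypothetical stand-in for the test server's routing (inferred from the descriptions, not taken from its source), and the `sticky_redirects` flag is a made-up name for "every observed redirect updates the URL used for later requests" versus "only a redirect on the first request does".

```python
from typing import List, Optional


def server_redirect(path: str, range_start: int) -> Optional[str]:
    """Hypothetical routing: return a redirect target, or None to serve directly."""
    if path == "/audio-redirect-second-part" and range_start != 0:
        return "/audio-normal"
    if path == "/audio-redirect-first-part":
        return "/audio-normal"
    return None


def simulate(start_path: str, range_starts: List[int], sticky_redirects: bool) -> List[str]:
    """Return the path each range request is initially sent to.

    sticky_redirects=True  models the Chrome behaviour observed above.
    sticky_redirects=False models the Firefox behaviour observed above.
    """
    base = start_path
    sent_to = []
    for i, start in enumerate(range_starts):
        sent_to.append(base)
        target = server_redirect(base, start)
        # The redirect itself is always followed; the question is only
        # whether it changes where the *next* request starts.
        if target is not None and (sticky_redirects or i == 0):
            base = target
    return sent_to
```

For the first page, `simulate("/audio-redirect-second-part", [0, 1000, 2000], True)` ends up sending the last request to `/audio-normal`, while the same call with `False` keeps sending every request to `/audio-redirect-second-part`. For the second page both settings agree.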
I'm looking to spec the correct behaviour here, and I'd like to do the same for other range requests like downloads.
Initially, the Firefox behaviour seems inconsistent. But, if a browser were to request multiple ranges in parallel, Chrome's behaviour could be racy.
I'm not familiar with the CDN pattern @tyoshino mentioned. Are there any further details? Do these CDNs tend to redirect for the initial range, or do they perform multiple redirects for different parts of the media resource?
Range is already allowed to be set by media elements due to https://fetch.spec.whatwg.org/#unsafe-request-flag. Not necessarily great as it allows poking holes in the same-origin policy (see also #568), but that is how it is.
@horo-t @mikewest it seems Chrome has the strictest handling of media element range requests thanks to your efforts:
Given that rather weird behavior it seems we might be able to outlaw redirects for subsequent requests completely. This would also help https://github.com/annevk/orb, though it does not matter much. Is there a reason they are allowed? And if not, are you interested in simplifying that logic?
cc @padenot @anforowicz
If we can get away with dropping redirects entirely, I'd be happy too. @jakearchibald might have more context on how we landed on the current behavior?
IIRC, the subsequent redirects are sometimes used to reauthenticate the resource. I.e., you watch a video for some time and then walk away for a couple hours, upon clicking play again the provider may need to reauthenticate your session (for content license requirements) which may redirect through some validation before going back to the final redirected URL.
Chrome has some funky behavior around HTMLMediaElement + redirected range requests.
https://codereview.chromium.org/1220963004 denied responses to range requests if their origin is distinct from the origin response for the initial request.
https://codereview.chromium.org/1356353003 relaxes that restriction to accept responses to range requests if they're CORS-same-origin with the origin response from the initial request. It also treats "range" as a simple header for the purposes of preflights if the request is CORS enabled (e.g. <video crossorigin ...>).
It would be nice to spec this out in a sane way. :)
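For readers trying to follow the two Chromium changes, the difference might be sketched like this. It is a loose reading of the descriptions above, not Chromium's logic: `Response`, `accept_strict` and `accept_relaxed` are hypothetical names, and "CORS-same-origin" is approximated as "same origin as the initial response, or the response passed a CORS check".

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Response:
    """Hypothetical, simplified response record."""
    url: str
    passed_cors_check: bool = False


def origin_of(url: str):
    # Simplified origin tuple; default ports are not normalised.
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def accept_strict(initial: Response, later: Response) -> bool:
    # First change: reject range responses whose origin differs
    # from the origin of the initial response.
    return origin_of(later.url) == origin_of(initial.url)


def accept_relaxed(initial: Response, later: Response) -> bool:
    # Second change (assumed reading): also accept cross-origin range
    # responses when they passed a CORS check.
    return accept_strict(initial, later) or later.passed_cors_check
```

Under this reading, a range response from a different mirror is rejected by the first rule, and accepted by the second only if the media element made a CORS-enabled request and the mirror opted in.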