whatwg / fetch

Fetch Standard
https://fetch.spec.whatwg.org/

Header to opt out of opaque redirect #601

Open jakearchibald opened 7 years ago

jakearchibald commented 7 years ago

Something like Access-Control-Allow-Visible-Redirect: *

This would make redirect responses visible. If the request was cross-origin, it would still have to pass existing CORS checks, and would be filtered accordingly.

annevk commented 7 years ago

Rough duplicate of #75.

I don't think we should call it Access-Control, as it's not really about CORS: we hide redirects same-origin as well (for cross-origin redirect resources you'd still need CORS, of course).

Can you explain the use case, and do you know to what extent Chrome is interested in implementing it? And other browsers?

jakearchibald commented 7 years ago

The suggestion came from an internal team who wanted to act conditionally on the type of redirect they received. However, they may not have the server control they need to add this header anyway.

I was just adding it for completeness. Should be considered low priority unless we get demand.

rajkjag commented 7 years ago

In our specific case, the navigation preload request is being DOS blocked and redirected to a CAPTCHA page. However, the preload response resolves to a response with type "opaqueredirect" and no access to the redirect URL (sent back in the Location: header). The server which performs the DOS block is not under our control and sits upstream.

While we can work around this specific case, it might be easier if ServiceWorker code on the client had access to the redirect URL for error handling and recovery.
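For readers following along, here is a minimal TypeScript sketch of what script can distinguish today when using redirect mode "manual". The `ManualResponseView` interface and the `classifyManualResponse` function are illustrative names, not part of any API; the sketch relies only on the filtered response reporting type "opaqueredirect", as described above.

```typescript
// Structural view of the two Response fields this sketch reads.
interface ManualResponseView {
  type: string;
  status: number;
}

type ManualOutcome =
  | { kind: "opaque-redirect"; requestUrl: string }
  | { kind: "readable"; status: number };

// Classify a response obtained with `redirect: "manual"`.
// For an opaque redirect the only URL script still knows is the one it requested;
// the Location value is not available.
function classifyManualResponse(
  res: ManualResponseView,
  requestUrl: string,
): ManualOutcome {
  if (res.type === "opaqueredirect") {
    return { kind: "opaque-redirect", requestUrl };
  }
  return { kind: "readable", status: res.status };
}
```

A `Response` returned by `fetch(url, { redirect: "manual" })` has the fields this interface reads, so it could be passed in directly.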

annevk commented 7 years ago

@rajkjag that would violate https://fetch.spec.whatwg.org/#atomic-http-redirect-handling.

rajkjag commented 7 years ago

The proposal on the table was to have the server specify, via a response header, the list of headers to expose.

annevk commented 7 years ago

Okay, I got confused since you said the server wasn't under your control and it would be easier if service workers had access to the redirect URL.

rajkjag commented 7 years ago

I took a closer look at our setup. While the server that generates the captcha URL is a different server, it is being operated in a slave mode and we seem to have control over the final response and headers. Apologies for the confusion.

  1. Primary server gets requests.
  2. Contacts the DOS slave server and gets back an indication of whether to block or redirect.
  3. Returns a redirect response with the redirect URL.

If we cannot implement a header and have the client respect it and expose the redirect URL, then our fallback option would be to embed the redirect URL inside the body of the redirect response (If and when the server determines it is safe to do so).

ibnesayeed commented 6 years ago

Being able to expose status code, location, and various other headers will be great when redirects are handled manually in a fetch. The name of this redirect mode is confusing and misleading. However, if we can somehow opt-out of the opaqueredirect then the name can be justified.

Our use case is related to web archiving. An important need in web archival replay systems is to avoid live-leakage (we call them zombies). When a web page is archived and replayed (for example in the Internet Archive's Wayback Machine or other web archives), it is very important that all the page requisites are served from the archive and not from the live web. However, depending on how the resource was referenced (i.e., absolute URL, absolute path, relative), it may resolve to an invalid location or the live web. To prevent it from happening, archival replay systems rewrite all the references before serving to the client. However, in some complex situations when resource references are generated using JavaScript, they might fail to fix them.

To prevent it from happening we created Reconstructive, which allows intercepting all requests in a Service Worker and rerouting them back to the archive if they are going elsewhere. The goal is to free the server from performing any rewrites and deliver the original archived content, which can then be fixed by the SW to ensure proper replay. This generally works, but there are many cases where the archive has captured redirect responses originally from the live pages and replays them back with the original Location header as seen at the time of archiving. If the location is an absolute URI (potentially using the domain of the original site, not the archive), then not being able to rewrite it from the SW before handing it over to the browser throws it out of the scope of the SW (https://github.com/oduwsdl/ipwb/issues/456). The current workaround is to make sure such headers are rewritten on the server side, but we would like it to work without the server being smart.

annevk commented 6 years ago

@ibnesayeed does adding a header to all redirects count as the server being smart? Because we cannot just reveal this information as already discussed. It has to be opt-in.

ibnesayeed commented 6 years ago

@annevk opt-in mechanism would work for us.

felixfbecker commented 5 years ago

I'm implementing a REST API for contents in a git repository and would like to represent symlinks through redirects, so that the client has the ability to a) easily follow the (final) destination but also b) inspect the Location header manually if needed. This seems impossible at the moment, and it is very counter-intuitive because you'd expect to be able to handle redirects manually when specifying redirect: 'manual'. It would work if this was made possible through a new access control header.

slaneyrw commented 5 years ago

This is starting to become a problem with Single page applications and remote authentication providers. A lot of server side frameworks will send back a 302 in reaction to an authentication failure so the user can authenticate. If this is initiated from a fetch request we need to navigate the browser, not have the fetch follow the response.

With XMLHttpRequest we could tell the difference by inspecting the X-Requested-With header and, on the server, switch the response from 302 to another status code that isn't auto-followed. There are no "hints" in the standard that we can use to stop sending a 302, and the JavaScript cannot read the intended destination... it's lose-lose.

annevk commented 5 years ago

You can still send an X-Requested-With header if you want, fetch() is no different in that respect. But yeah, there's probably enough that we should add something here for the server to opt into.
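A rough sketch of the server-side switch described in the two comments above, assuming the client sets X-Requested-With itself on its fetch() calls. The function name, the choice of 401, and the custom X-Location header are assumptions made for illustration; nothing here is defined by the Fetch Standard.

```typescript
// Lower-cased request header names mapped to their values.
type RequestHeaders = Record<string, string | undefined>;

interface AuthChallenge {
  status: number;
  headers: Record<string, string>;
}

// Decide how to answer an unauthenticated request.
// Script-initiated requests (marked by X-Requested-With) get a non-redirect
// status plus a custom header that script can read; ordinary navigations get
// a normal 302 that the browser follows on its own.
function buildAuthChallenge(
  requestHeaders: RequestHeaders,
  loginUrl: string,
): AuthChallenge {
  const requestedWith = requestHeaders["x-requested-with"] ?? "";
  if (requestedWith !== "") {
    // Hypothetical convention: 401 with the target in X-Location.
    return { status: 401, headers: { "X-Location": loginUrl } };
  }
  return { status: 302, headers: { Location: loginUrl } };
}
```

Note that on a cross-origin request a custom header such as X-Requested-With is not CORS-safelisted, so it would trigger a preflight.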

garygreen commented 5 years ago

We also have a use case for this in service workers: we would like to fetch a request and see if it was redirected and to where, then clear any cache for the URL the browser will be redirected to before the service worker fetches that page and potentially returns cached / stale content. This is particularly important in a cache-first strategy.

Our particular flow is:

  1. POST request
  2. Server validates request and responds with a 301 Redirect with any errors / success message flashed to session (and will be rendered on the new page once requested)
  3. The Location redirect returned from the server should have its cache cleared by the service worker, so that when the page is loaded by the service worker it will fetch from the network and won't show stale content.

Maybe I'm approaching this the wrong way though, as I'm kinda confused about what an opaqueredirect even is and why it's useful.

annevk commented 5 years ago

https://github.com/whatwg/fetch/issues/601#issuecomment-328533420 has a link that explains why we don't expose contents of redirects. opaqueredirect exists to allow redirect navigations to still work offline.

nemzes commented 5 years ago

Just adding to the list of "interest shown" in this feature.

Our use-case is very much as described in https://github.com/whatwg/fetch/issues/601#issuecomment-502667208 — we are implementing a generic authentication proxy wrapper (so it's not application-specific), which will occasionally respond with an oauth redirect when the auth token has expired. fetch() obviously can't identify these responses, so we have a conundrum.

slaneyrw commented 4 years ago

Ok, it's over 2 years since this was raised. Surely the stakeholders have had some say by now

ibnesayeed commented 4 years ago

> Ok, it's over 2 years since this was raised. Surely the stakeholders have had some say by now

We are still waiting for this to make it into the specs and then be implemented. However, we know standardization is a slow process and things don't always happen the way (and when) we want them to.

annevk commented 4 years ago

@yutakahirano @youennf @ddragana thoughts?

yutakahirano commented 4 years ago

Clarification question:

This is not for general responses, but only for navigation responses available in service workers, right?

annevk commented 4 years ago

This header would be about exposing redirect responses to script in general (if the response is cross-origin it would also have to use CORS). You'd have to use redirect mode "manual" and the response would have to set the header.
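To make the proposed shape concrete, here is a purely hypothetical sketch of the filtering decision for a same-origin redirect fetched with redirect mode "manual". No such header is specified anywhere; the name below is simply the one floated in the opening comment (which was already questioned in this thread), and every type and function name is invented for illustration. Cross-origin responses would additionally go through CORS filtering, which this sketch does not model.

```typescript
// Hypothetical opt-in header name taken from the opening comment; not specified.
const OPT_IN_HEADER = "access-control-allow-visible-redirect";

// Same-origin redirect response as received from the network
// (header names lower-cased).
interface RawRedirect {
  status: number;
  headers: Record<string, string>;
}

// What script would get to see.
interface ExposedRedirect {
  type: "opaqueredirect" | "basic";
  status: number;
  location: string | null;
}

// Without the opt-in header the response stays opaque, as it is today.
// With it, the status and Location value would become readable.
function filterManualRedirect(raw: RawRedirect): ExposedRedirect {
  const optedIn = OPT_IN_HEADER in raw.headers;
  if (!optedIn) {
    return { type: "opaqueredirect", status: 0, location: null };
  }
  return {
    type: "basic",
    status: raw.status,
    location: raw.headers["location"] ?? null,
  };
}
```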

slaneyrw commented 4 years ago

Question... if the mode is set to manual, why doesn't the Fetch response contain the actual 30x response, instead of playing this security hand-waving exercise? If the original request is NOT cross-origin then I should be allowed to see the response.

The problem with the technology it replaces ( XMLHttpRequest ) is that you couldn't stop the follow, so we worked around it by not sending back a 30x, using another status code ( i.e. 40x ) and adding another header ( i.e. X-Location ). Client-side code reads the response and manually sets the navigator URI. This requires us to trust the referrer ( lol ) to work out cross-origin requests and handle them appropriately.
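A minimal client-side sketch of the workaround just described, assuming the server answers with a 4xx status and puts the target in an X-Location header. The interfaces and the function name are hypothetical; the header name is the example given in the comment above.

```typescript
// Just enough of the Headers interface for this sketch.
interface HeaderReader {
  get(name: string): string | null;
}

interface WorkaroundResponseView {
  status: number;
  headers: HeaderReader;
}

// Return the URL the page should navigate to, or null if the response
// is not one of the "redirect in disguise" replies.
function navigationTarget(res: WorkaroundResponseView): string | null {
  if (res.status < 400 || res.status >= 500) {
    return null;
  }
  return res.headers.get("X-Location");
}
```

In a page, the caller might pass the `Response` from `fetch()` to `navigationTarget` and, when the result is non-null, assign it to `location.href`.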

Not allowing the client to read the redirect target is just making us do the same. This aspect of the spec is effectively pointless.

annevk commented 4 years ago

@slaneyrw https://github.com/whatwg/fetch/issues/601#issuecomment-328533420. (And no it's not pointless, it's complicated.)

slaneyrw commented 4 years ago

> @slaneyrw #601 (comment). (And no it's not pointless, it's complicated.)

Granted.. in CROSS-ORIGIN requests. But where the same application that issues the request sends back a response I can't see the point in hiding the URL.... I already own both pieces of the puzzle. There MUST be a way of being able to tell the fetch sub-system what behaviour I want, maybe similar to CORS.

At the moment we are all bypassing these controls because they don't work in practice.

yutakahirano commented 4 years ago

I have no objection to introducing this, but I'm not sure we will implement it in the near future.

determin1st commented 4 years ago

4

may this be considered Spaghetti-code then?

determin1st commented 4 years ago

Actually I may propose some custom implementation which will involve almost the same work as proposed by the issue starter. So, with the header the work should be done both on server (to send okay, you may redirect header) and on client as well (client would have to check opaque/not, compose new url and make a new fetch).

custom redirect

Accepted content-type is JSON and a plaintext value (not "secured" by the spec yet) is a redirect URL. Other results are considered error. I've made some test to test it:

http://raw.githack.com/determin1st/httpFetch/master/test-8/index.html

manual redirect wasted

@annevk if you keep on blocking my messages i would have to search w3c contacts. Because spec doesn't move anywhere (it moves nowhere) and it may be considered as private agreements if you continue.

annevk commented 4 years ago

@determin1st if you continue with the off-topic posts and rather rude manner of communication you'll be banned from the WHATWG.

determin1st commented 4 years ago

@annevk my comments weren't off-topic, they are related. If you don't like them, that's not a tech reason to ban the author. If there was rudeness - I am sorry.

Let's concentrate on this option - redirect:manual. The way it is currently defined in the spec is wasted/screwed/meaningless. That is an assessment of your work, not rudeness. The browser already follows the route prescribed by the spec, so it's testable - I did the test - it doesn't work. You may prove it's working - create your test. If you need mine, I'll drop it.

Workarounds, workarounds..

Besides the method proposed by the issue starter, there were already valid methods posted. For example, returning a custom HTTP STATUS 4xx will work now, without further negotiations with the WHATWG and browser makers. I would call @slaneyrw my ally, because he operates on the same interest level as I do, but has more strength not to be rude.

Interest?

end users - app maker - browser maker - spec maker

(we all) - (me) - (googler) - (you)

Black and White

Let's concentrate on the main reason why - security. You did an example: https://fetch.spec.whatwg.org/#atomic-http-redirect-handling it says that:

Except for the last response URL, if any, a response’s URL list cannot be exposed to script

Let it be correct (without a test?). So, there is no URL for redirect:manual. Job's done, option blocked.

The restriction itself opens doors. My test above shows one route, there will be others. Am I a hacker - security breaker? That's funny - You can't force users into security or stop them quitting security - He/She may drop their passwords anywhere anytime. You may only opt in to security by default and may warn the user (at the cost of the user's hate) when he/she follows another road. That's the basics of incorrect restrictions - technical, non-political.

The restriction itself closes back-compatibility with the old servers. Does it close some security holes? Some old annoying problems with the old stuff? May I see some info about it?

Off-topic

If you don't understand and don't accept anything written above, I have nothing more to advance to this repo/spec. Last comment, just don't answer - decide it alone with the browser makers as my interest is opted-out together with the end user interest. But let it hang here, you may close the issue. Otherwise, let's consider all accepted decisions arrived from googlers who advanced from the end user's point as incorrect.

Regards

Sectimus commented 4 years ago

In my SPA, my server could return a 3xx response requiring the user to reauthenticate with a third party oauth provider. The fact that there is no way for me to simply read the 'location' header and intercept the fetch redirect to do a real redirect instead has single-handedly put me back to using nasty XMLHttpRequests. :'( I thought we'd moved on from this!

tonyhb commented 4 years ago

Bumping this - it's very surprising that manual redirect modes in fetch aren't manual in that you can't ever find out where you're being redirected to.

Time to go back to XMLHttpRequest yet again (necessary for request progress also...)

annevk commented 4 years ago

It's no different with XMLHttpRequest (that basically invokes fetch).

mnot commented 4 years ago

@annevk just curious -- does the threat model change if it's a same-site redirect?

annevk commented 4 years ago

@mnot I think at this point the Location header ~= HttpOnly cookie.

tonyhb commented 4 years ago

Ah. XMLHttpRequest changes the responseURL property after following redirects - https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/responseURL - so we can at least get some sort of hacky attempt if we need to add headers when the request is redirected. Check for a 4XX, check that the URL is different, re-request. :)

Since you mentioned it, looks like Response.url changes too, so I can use fetch for this redirect hack. Thanks!

kuzvac commented 4 years ago

If you use fetch and the request succeeds, you can determine that a redirect has occurred from the redirected flag on the fetch result. The url in the fetch result changes too.

But if there is more than one redirect in the fetch round trip, you can't get the intermediate URLs, only the last one.
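The two properties mentioned above can be combined in a small helper. This is only a sketch: `FollowedResponseView` and `finalUrlIfRedirected` are made-up names, and it assumes the default redirect mode ("follow") and a fetch that actually succeeded.

```typescript
// The two Response properties this sketch reads.
interface FollowedResponseView {
  redirected: boolean;
  url: string;
}

// Return the final URL if at least one redirect was followed, otherwise null.
// Intermediate hops are not exposed, so only the last URL can be reported.
function finalUrlIfRedirected(res: FollowedResponseView): string | null {
  return res.redirected ? res.url : null;
}
```

A `Response` from `fetch(url)` can be passed in as-is, for example to compare the result against the URL that was originally requested.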

Sectimus commented 4 years ago

> If you use fetch and the request succeeds, you can determine that a redirect has occurred from the redirected flag on the fetch result. The url in the fetch result changes too.
>
> But if there is more than one redirect in the fetch round trip, you can't get the intermediate URLs, only the last one.

Not really possible if the redirect was to a third party origin, since CORS kicks in and you then can't read the URL it attempted to reach cross-origin, even though it can be viewed in the network debug tab in devtools.

slaneyrw commented 4 years ago

> > If you use fetch and the request succeeds, you can determine that a redirect has occurred from the redirected flag on the fetch result. The url in the fetch result changes too. But if there is more than one redirect in the fetch round trip, you can't get the intermediate URLs, only the last one.
>
> Not really possible if the redirect was to a third party origin, since CORS kicks in and you then can't read the URL it attempted to reach cross-origin, even though it can be viewed in the network debug tab in devtools.

Shouldn't matter if the target is a third party ( think OpenId/Connect ); if the system initiating the redirect is same site, then we should be able to read the destination header.

slaneyrw commented 4 years ago

> In my SPA, my server could return a 3xx response requiring the user to reauthenticate with a third party oauth provider. The fact that there is no way for me to simply read the 'location' header and intercept the fetch redirect to do a real redirect instead has single-handedly put me back to using nasty XMLHttpRequests. :'( I thought we'd moved on from this!

Nope, can't return 3xx when making "ajax" requests. Nothing you can return in XMLHttpRequest or fetch can be read by the browser. At least with XMLHttpRequest the server can tell by looking at the X-Requested-With header and change the response from 3xx to something else.

This is now causing workarounds in jQuery, fetch, angular, react, react-native, etc.

WhatWG ppl, are we listening yet or do you still think this is a reasonable course of action?

maapteh commented 3 years ago

Also bumping this. What's the point of having manual while not knowing the redirect location?

annevk commented 3 years ago

It's for service workers to be able to deal with responses to navigation requests (which are handled manually by the navigate algorithm). Explained under https://fetch.spec.whatwg.org/#concept-request-redirect-mode.

maapteh commented 3 years ago

Thanks for explaining @annevk, that part is clear, but even in normal curl I can at least get Location: https://SNIP in the response. I thought I could simply put part of an external API into our own NodeJS part, but now I don't know if the response of the override is success or error, since that info is in Location :) So I have to solve it in my service in a real language now.

annevk commented 3 years ago

What I linked also points to https://fetch.spec.whatwg.org/#atomic-http-redirect-handling which is also discussed earlier in the thread. Basically, the value of a Location header can be about as sensitive as an HttpOnly cookie.

slaneyrw commented 3 years ago

> It's for service workers to be able to deal with responses to navigation requests (which are handled manually by the navigate algorithm). Explained under https://fetch.spec.whatwg.org/#concept-request-redirect-mode.

Ok, so the spec was updated in June 2020 (after my last comment) to appear to give special rights to service workers. It's still not clear how that should work.

If you look at the section on https://fetch.spec.whatwg.org/#concept-filtered-response-opaque-redirect, it mentions the URL List is harmless.

So do we now have conflicting statements on whether reading the location to redirect to is available or not? Why do service workers get special rights for this? Do we now need to use service workers exclusively to make fetch calls?

Or have I completely misread how this is supposed to work?

Just to re-iterate my scenario.

SPA javascript application initiates fetch call back to the SAME domain for some sort of resource. Server evaluates that an authorization is required and sends back a 301 response with the location the SPA is supposed to navigate to ( i.e. redirect to OpenId/connect authorize endpoint ). SPA framework navigates browser to intended destination

annevk commented 3 years ago

You have misread it. The URL list of a response doesn't contain the Location header information. In case of an opaque-redirect response it would only contain the request URL, which you already have. This feature also was only ever meant for service workers so they can store an opaque redirect response and replay it (but they won't ever be able to obtain the value from the Location header either).

slaneyrw commented 3 years ago

> You have misread it. The URL list of a response doesn't contain the Location header information. In case of an opaque-redirect response it would only contain the request URL, which you already have. This feature also was only ever meant for service workers so they can store an opaque redirect response and replay it (but they won't ever be able to obtain the value from the Location header either).

ok, thank you for the clarification... but still pointless then. The workarounds continue!

mitar commented 3 years ago

Could the spec maybe allow accessing the Location header if the redirect target is on the same origin? Or even only if it is a path-absolute URL (one starting with /, without a hostname, etc.)?

annevk commented 3 years ago

Not without opt-in: https://github.com/whatwg/fetch/issues/601#issuecomment-614474515.

freewheel70 commented 3 years ago

> Basically, the value of a Location header can be about as sensitive as an HttpOnly cookie

Hi @annevk Could you explain more about why Location header is so sensitive ?

In case of 302, the Location header contains the value of the temporary redirection, right ? So I guess its value is not that security-critical.

slaneyrw commented 3 years ago

> > Basically, the value of a Location header can be about as sensitive as an HttpOnly cookie
>
> Hi @annevk Could you explain more about why Location header is so sensitive ?
>
> In case of 302, the Location header contains the value of the temporary redirection, right ? So I guess its value is not that security-critical.

Especially if it's same origin, absolutely not a security hole.

But there are implementations in the wild so if the spec was changed to allow opt-in ( or even just change the default implementation ) we still cannot rely on the behaviour and there is no workaround. I'm afraid that this spec is basically worthless.

It's been over 4 years and this is still not "fixed"... the world has moved on and we work around the problem by sending other response status codes that do not have this arbitrary limitation.