Rob--W / cors-anywhere

CORS Anywhere is a NodeJS reverse proxy which adds CORS headers to the proxied request.
MIT License

Endless redirect on HTTPS site #27

Open ptgolden opened 9 years ago

ptgolden commented 9 years ago

First of all, thank you for making your proxy public. It's been a great help in a browser-based application I'm working on that needs to consume linked data. I'm going to talk to my colleagues to see if we can do anything to help with #25.

I noticed a problem today where requesting a certain resource would result in an endless redirect to the same Location as the original request. Here's a test on test-cors.org. This endless redirect does not occur when requesting from the non-HTTPS proxy.

Rob--W commented 9 years ago

This issue is caused by an incorrectly configured server (orcid.org). When orcid.org is accessed over https, it redirects to http. This is bad security practice, but it is quite common. What's worse is that when it receives an X-Forwarded-Proto: https request header, it also redirects to http. This means that the site tries really hard to prevent itself from being used over https (both directly and through a proxy).

curl https://orcid.org/0000-0002-3617-9378 -I
curl http://orcid.org/0000-0002-3617-9378 -H 'X-Forwarded-Proto: https' -I

Response:

HTTP/1.1 301 Moved Permanently
Server: nginx/1.1.19
Content-Type: text/html
Date: Wed, 03 Jun 2015 22:14:53 GMT
Location: http://orcid.org/0000-0002-3617-9378
Connection: keep-alive
Set-Cookie: X-Mapping-....redacted...; path=/
Content-Length: 185
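
The second curl command above gets a redirect whose Location is the very URL that was requested, which is what makes the loop endless. A minimal sketch of that check, assuming a hypothetical helper named `isSelfRedirect` (it is not part of CORS Anywhere):

```javascript
// Hypothetical helper for illustration; not part of CORS Anywhere.
// Returns true when a redirect response points back at the requested URL.
function isSelfRedirect(requestedUrl, statusCode, locationHeader) {
  const isRedirect = [301, 302, 303, 307, 308].includes(statusCode);
  if (!isRedirect || !locationHeader) {
    return false;
  }
  const requested = new URL(requestedUrl);
  // Location may be relative, so resolve it against the requested URL.
  const target = new URL(locationHeader, requested);
  return target.href === requested.href;
}

// Values taken from the curl output above.
const viaProxyHeader = isSelfRedirect(
  'http://orcid.org/0000-0002-3617-9378', 301,
  'http://orcid.org/0000-0002-3617-9378');
const directHttps = isSelfRedirect(
  'https://orcid.org/0000-0002-3617-9378', 301,
  'http://orcid.org/0000-0002-3617-9378');

console.log(viaProxyHeader); // true: same URL, so following it repeats the request
console.log(directHttps);    // false: an https to http downgrade, not a loop by itself
```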

There are several solutions:

  1. Contact the administrator of orcid.org and ask them to fix their server configuration.
  2. Access CORS Anywhere over http (not https). (Caveat: CORS Anywhere does not strip most headers, so if any site uses CORS Anywhere over https and the proxied site replies with a Strict-Transport-Security header, browsers will automatically request https instead of http, and this suggestion will not work.)
  3. Self-host CORS Anywhere, disable the xfwd option (see server.js) and add X-Forwarded-Proto to the removeHeaders list.
  4. Modify CORS Anywhere to detect such redirect loops and automatically strip the xfwd headers.

Option 2 is the easiest and most practical one for you at the moment (but take note of the caveat), while option 3 is recommended if you are willing to maintain a server.
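
Option 3 could look roughly like the sketch below. The option names (`httpProxyOptions`, `xfwd`, `removeHeaders`) follow the repository's server.js; the function name `buildProxyOptions`, the port and the host are assumptions made for illustration.

```javascript
// Sketch of the options for a self-hosted instance (option 3 above).
// buildProxyOptions is a hypothetical name; adapt the values to your setup.
function buildProxyOptions() {
  return {
    // Do not let the underlying proxy add X-Forwarded-* headers itself.
    httpProxyOptions: { xfwd: false },
    // Also strip X-Forwarded-Proto when something in front of the proxy
    // (for example a hosting platform's router) has already added it.
    removeHeaders: ['x-forwarded-proto'],
  };
}

// Usage, assuming the cors-anywhere package is installed:
//   require('cors-anywhere')
//     .createServer(buildProxyOptions())
//     .listen(8080, '127.0.0.1');
```

With both settings in place, the proxied site no longer sees X-Forwarded-Proto: https, so it has no reason to answer with the redirect to http shown above.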

ptgolden commented 9 years ago

Great, thank you. I ended up hosting my own server.

I had to hack around a bit to get the server to work in a subdirectory. Would support for that be something you would be interested in adding? If so, I can make a pull request.

Rob--W commented 9 years ago

What do you mean by "getting the server to work in a subdirectory"?

ptgolden commented 9 years ago

Sorry, I meant setting it up to serve from a URL that included a pathname, e.g. https://example.com/cors-anywhere/

The base proxy URL is always set to only include the host name (https://github.com/Rob--W/cors-anywhere/blob/master/lib/cors-anywhere.js#L275), so redirects would always go back to the hostname without the trailing path.

ptgolden commented 9 years ago

To be clear, it was only a problem when a requested resource would redirect.
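
The effect described here can be illustrated with a small sketch. `proxiedLocation` is a hypothetical helper, not the project's code, and the example.org target is made up; it only shows how the rewritten Location differs when the proxy base URL keeps or drops its pathname.

```javascript
// Hypothetical helper for illustration; not the project's implementation.
// Builds the Location header a proxy would send back to the browser when
// the proxied site answers with a redirect.
function proxiedLocation(proxyBaseUrl, targetUrl, locationHeader) {
  // Resolve a possibly relative Location against the URL that was proxied.
  const absolute = new URL(locationHeader, targetUrl).href;
  // Make sure exactly one slash separates the proxy base from the target.
  const base = proxyBaseUrl.endsWith('/') ? proxyBaseUrl : proxyBaseUrl + '/';
  return base + absolute;
}

const target = 'http://example.org/old';
// Base URL built from the host name only: the pathname is lost.
const hostOnly = proxiedLocation('https://example.com', target, '/new');
// Base URL that keeps the pathname the proxy is served from.
const withPath = proxiedLocation('https://example.com/cors-anywhere/', target, '/new');

console.log(hostOnly); // https://example.com/http://example.org/new
console.log(withPath); // https://example.com/cors-anywhere/http://example.org/new
```

Only the second form leads the browser back to the proxy when it is served from a subdirectory, which is why non-redirecting requests were unaffected.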

Rob--W commented 9 years ago

> Sorry, I meant setting it up to serve from a URL that included a pathname, e.g. https://example.com/cors-anywhere/
>
> The base proxy URL is always set to only include the host name (https://github.com/Rob--W/cors-anywhere/blob/master/lib/cors-anywhere.js#L275), so redirects would always go back to the hostname without the trailing path.

I'd accept such a pull request. Currently, the first character ("/") is stripped before the rest of the program continues, but you can also add a new config option, and change https://github.com/Rob--W/cors-anywhere/blob/7a138b36ccd45d47da16c07672fde317276398aa/lib/cors-anywhere.js#L237 to something that takes a regex, function or something else in config to modify req.url before passing it to parseURL.
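
One possible shape for such an option is sketched below. Everything in it is an assumption: `makeUrlRewriter` is a hypothetical name, and no such option exists in CORS Anywhere as discussed in this thread. The default branch mirrors the current behaviour of stripping the first character.

```javascript
// Hypothetical sketch; not an existing CORS Anywhere option.
// Turns a config value (function, RegExp or nothing) into a function that
// maps req.url to the string that would be handed to parseURL.
function makeUrlRewriter(rule) {
  if (typeof rule === 'function') {
    return rule;
  }
  if (rule instanceof RegExp) {
    // Remove whatever the pattern matches, e.g. a path prefix.
    return (url) => url.replace(rule, '');
  }
  // Default: strip only the leading "/", as is done currently.
  return (url) => url.slice(1);
}

const stripDefault = makeUrlRewriter();
const stripPrefix = makeUrlRewriter(/^\/cors-anywhere\//);

console.log(stripDefault('/http://example.org/'));              // http://example.org/
console.log(stripPrefix('/cors-anywhere/http://example.org/')); // http://example.org/
```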

> To be clear, it was only a problem when a requested resource would redirect.

The issue that you've reported is more specific: the requested resource behaved incorrectly, and always returned a redirect under certain circumstances (https://github.com/Rob--W/cors-anywhere/issues/27#issuecomment-108632963).

ptgolden commented 9 years ago

> The issue that you've reported is more specific: the requested resource behaved incorrectly, and always returned a redirect under certain circumstances (#27 (comment)).

Yes, you're right. I was referring to the issue of ignoring the pathname of the proxy server, which only happened on redirects.

> I'd accept such a pull request. Currently, the first character ("/") is stripped before the rest of the program continues, but you can also add a new config option, and change https://github.com/Rob--W/cors-anywhere/blob/7a138b36ccd45d47da16c07672fde317276398aa/lib/cors-anywhere.js#L237 to something that takes a regex, function or something else in config to modify req.url before passing it to parseURL.

Great, I'll submit something later today.