Rob--W / cors-anywhere

CORS Anywhere is a NodeJS reverse proxy which adds CORS headers to the proxied request.
MIT License

Does CORS-Anywhere work with URLs that are "protected" by web access control products like Oracle OAM, CA Siteminder, etc.? #378

Open ohaya opened 3 years ago

ohaya commented 3 years ago

Hi,

We have a number of situations where our users use (XHR/Fetch) clients to access resources (URLs) that are on different domains, and where those resources are "protected" by something like a "web agent" (e.g., Oracle OAM webgate, CA Siteminder webagent, etc.). These web agents typically use redirects to cause the incoming browser request to produce a request to a different URL, which then communicates with the web access control product's server, so something like, in the case of these XHR clients:

XHR client (in browser) ==> Request to protected URL (in a different domain than the server that served the client code)
Access Product Web agent ==> Sends 302/redirect to client to a different Access product endpoint
XHR client follows the redirect (this request would have "Origin: null" due to the redirect)
Access product server consumes the request, "authenticates" the user, and sends 302/redirect to client, together with some Set-Cookie
XHR client ==> Request to protected URL but with Access product cookies

The above flow is somewhat high-level, but would a CORS-Anywhere server work with this scenario?

Thanks, Jim

ohaya commented 3 years ago

Hi,

I just found this on the help on the demo page:

Redirects are automatically followed. For debugging purposes, each followed redirect results
in the addition of a X-CORS-Redirect-n header, where n starts at 1. These headers are not
accessible by the XMLHttpRequest API.
After 5 redirects, redirects are not followed any more. The redirect response is sent back
to the browser, which can choose to follow the redirect (handled automatically by the browser).

But the README.md on the github project page says

This package does not put any restrictions on the http methods or headers, except for cookies. Requesting user credentials is disallowed. The app can be configured to require a header for proxying a request, for example to avoid a direct visit from the browser.

The protocols for the web access control products also rely on sending cookies and also query parameters during the authentication process, so do you think the out-of-box CORS-Anywhere would work?

Also I wanted to test, using your demo, but when entering the URL to the demo I am getting this:

GET http://test.whatever.com:7777/target/index.html
404 Not Found

Not found because of proxy error: Error: getaddrinfo ENOTFOUND test.whatever.com

Is that because, to use the demo, that your demo needs to be able to resolve the hostname in the URL that we enter?

Thanks, Jim

EDIT: I should mention that the "test.whatever.com" hostname is a hostname that is in the c:\windows\system32\drivers\etc\hosts file of the Windows workstation that I am running the browser from. I was hoping that the hostname in the URL that I entered into the demo page would get resolved by that hosts file, but it sounds like the hostname actually has to be resolvable by (maybe) your demo server itself? Is that the case?

Also, can an IP address be used in the URL that is entered into the demo page? Or, must it be a FQDN?

Rob--W commented 3 years ago

The protocols for the web access control products also rely on sending cookies and also query parameters during the authentication process, so do you think the out-of-box CORS-Anywhere would work?

No. It is not secure to enable cookies when the proxy is used to access multiple websites.

EDIT: I should mention that the "test.whatever.com" hostname is a hostname that is in the c:\windows\system32\drivers\etc\hosts file of the Windows workstation that I am running the browser from. I was hoping that the hostname in the URL that I entered into the demo page would get resolved by that hosts file, but it sounds like the hostname actually has to be resolvable by (maybe) your demo server itself? Is that the case?

A third-party server cannot look in your local hosts file. That would be quite a security issue on your end. CORS Anywhere is a public proxy that can only access publicly accessible resources. If you host CORS Anywhere within your intranet, then your instance would also be able to access those resources. But be very careful with access control: any website on a client in your network can then read any public (as in available without further authentication) resource within the network.
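As a rough illustration of the access-control point above, a self-hosted instance could restrict which page origins may use the proxy. The sketch below only builds an options object. `originWhitelist`, `requireHeader`, and `removeHeaders` are option names that appear in the project's `server.js`; the function name, the example origin, and the listen address are assumptions made for illustration.

```javascript
// Build options for a self-hosted CORS Anywhere instance that only serves
// pages from an explicit list of origins. The origin used below is hypothetical.
function buildIntranetProxyOptions(allowedOrigins) {
  if (!Array.isArray(allowedOrigins) || allowedOrigins.length === 0) {
    // Per the README, an empty whitelist allows every origin, which is the
    // situation the warning above is about, so refuse it here.
    throw new Error('Refusing to start without an explicit origin whitelist');
  }
  return {
    originWhitelist: allowedOrigins,
    requireHeader: ['origin', 'x-requested-with'],
    removeHeaders: ['cookie', 'cookie2'],
  };
}

const intranetOptions = buildIntranetProxyOptions(['http://intranet-app.example:7777']);
// Assumed usage, with the cors-anywhere package installed:
// require('cors-anywhere').createServer(intranetOptions).listen(8080, '0.0.0.0');
```

A whitelist limits which pages can read responses through the proxy; it does not by itself limit which internal hosts the proxy can reach.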

Also, can an IP address be used in the URL that is entered into the demo page? Or, must it be a FQDN?

An IP address or host name is valid. The list of valid TLDs is stored in https://github.com/Rob--W/cors-anywhere/blob/master/lib/regexp-top-level-domain.js

ohaya commented 3 years ago

Hi, I have started testing now with a test scenario, where my Javascript/XHR app is using the CORS Anywhere double URL to access a resource/URL that is hosted in a different domain and the resource is protected by an OAM webgate. I wasn't sure if I should put this post in this issue, or in the other "closed" issue, but decided it might fit better here?

This test is using CORS Anywhere that is deployed on one of my test servers.

Before I started testing with the protected resource, I have an almost identical "unprotected" test setup where the Javascript/XHR (in xhrtest/xhr-fakewava.html) is accessing a resource that is NOT protected, and when I test with this "unprotected" setup, the test works, i.e., the Javascript/XHR is able to retrieve the resource, using URL:

http://192.168.xxx.yy:8080/http://fakewava.whatever.com:7777/wavatarget/index.html
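The "double URL" above is just the proxy's base URL followed by the full target URL. A minimal sketch of building it and requesting it from page script follows; `proxy.example` is a placeholder for the self-hosted proxy address, and both function names are hypothetical.

```javascript
// Prefix a target URL with the proxy's base URL (the "double URL").
function toProxiedUrl(proxyBase, targetUrl) {
  // new URL() throws on malformed input, which catches typos early.
  const target = new URL(targetUrl);
  if (target.protocol !== 'http:' && target.protocol !== 'https:') {
    throw new Error('Only http and https targets are supported here');
  }
  // Drop trailing slashes on the base so exactly one slash separates the parts.
  return proxyBase.replace(/\/+$/, '') + '/' + targetUrl;
}

// Assumed usage in a browser page. The X-Requested-With header is sent
// because server.js is configured to require an Origin or X-Requested-With header.
async function fetchThroughProxy(proxyBase, targetUrl) {
  const response = await fetch(toProxiedUrl(proxyBase, targetUrl), {
    headers: { 'X-Requested-With': 'XMLHttpRequest' },
  });
  return response.text();
}
```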

So then I made a new target resource, "wavatarget-charlieeastweb05/index.html" that is hosted on a machine that has an OAM webgate.

I use an almost identical HTML page with the Javascript/XHR, "xhrtest/xhr-fakewava-protectedpage.html". The only difference is that the double URL is different:

http://192.168.157.23:8080/http://charlieeastweb05......com:7777/wavatarget-charlieeastweb05/index.html

However, when I use the page with the XHR pointing to the protected resource, I get a 404 error, and in the browser web developer=>network=>Response, it has the following message:

Not found because of proxy error: Error: self signed certificate in certificate chain

I was searching the Issues and found issue 123, that mentions the same error, from that thread, it looks like that problem was fixed awhile ago?

As I mentioned above, with a WAM like OAM, when a resource is protected, and a request is made for the resource, OAM will cause a 302/redirect, and in fact, in the Apache access_log, the last request I see shows a 302 response and the Location is set to one of the OAM endpoints:

"+++LOCATION+++++ https://charlieeastweb04....com:14430/oam/server/.... +++++++++++++"

I am guessing that when I do this test (XHR accessing protected resource), the browser is being redirected to that OAM URL and then the error shown in the browser web developer=>network=>Response occurs (the "self signed certificate in certificate chain"), but I am not sure why that would happen, because when I point the same browser directly to the protected resource URL, I get a cert popup and after selecting a certificate, I can access the page.

When that error occurs, can you tell me which component is getting the error? Is it the CORS Anywhere itself? Also which certificate chain is that error referring to?

Jim

EDIT: I just did another test where I just used the demo web app (on my system) and pointed it to the same URL:

http://charlieeastweb......gxaws.com:7777/wavatarget-charlieeastweb05/index.html

and I also got a 404 and the same error text in the demo web app text box.

I was wondering if you could suggest where I might try to put some debug code, e.g., in the server.js or in the cors-anywhere itself?

EDIT 1: FYI, I found this:

https://stackoverflow.com/questions/45088006/nodejs-error-self-signed-certificate-in-certificate-chain

and, only temporarily, I tried the suggestion of adding the

export NODE_TLS_REJECT_UNAUTHORIZED='0'

before starting the server.js, and it looks like I got further and I don't get that error, but am having some other problem (it doesn't look like I am authenticating successfully to OAM), so I need to figure out what is going on with that now...

ohaya commented 3 years ago

Hi Rob,

I think I almost have CORS Anywhere working with a test OAM scenario, but:

I currently am still having to do the "export NODE_TLS_REJECT_UNAUTHORIZED='0'" to avoid the "self-signed certificate in chain" problem. Is it possible to tweak the server.js or the CORS Anywhere code to import one of our CA certs so that I don't have to do that export?
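One way to trust a private CA without disabling verification is Node's `NODE_EXTRA_CA_CERTS` environment variable, which adds the certificates in a PEM file to the built-in roots. Node reads it only at process start, so the sketch below sets it for a child process that runs `server.js`. The CA file path and the helper names are assumptions for illustration; nothing here is part of CORS Anywhere itself.

```javascript
const { spawn } = require('child_process');

// Return a copy of baseEnv that trusts an extra CA bundle and no longer
// disables TLS verification.
function envWithExtraCa(caBundlePath, baseEnv) {
  const env = Object.assign({}, baseEnv, { NODE_EXTRA_CA_CERTS: caBundlePath });
  delete env.NODE_TLS_REJECT_UNAUTHORIZED;
  return env;
}

// Launch the proxy script in a child process with the adjusted environment.
function startProxyWithCa(caBundlePath, serverScript) {
  return spawn(process.execPath, [serverScript], {
    env: envWithExtraCa(caBundlePath, process.env),
    stdio: 'inherit',
  });
}

// Assumed usage (hypothetical path):
// startProxyWithCa('/etc/pki/our-test-ca.pem', 'server.js');
```

Setting the same variable in the shell before running `node server.js` has the same effect; the wrapper only makes the steps explicit.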

I am not 100% sure yet, but for my test with the protected resource, it is getting through most of the flow, but I am still getting an "ENOENT"/404 error at the end. That error SEEMS to be saying that there is a problem with the hostname, but I stood up a new DNS server for this testing.

OAM tends to return a 404 error when authentication fails, so I don't know for sure if the 404 error is because of an authentication error, or if it is because of something else like the name resolution.

Another possibility is that the problem may be that cookies that are normally created as part of the OAM authentication (and which are used for authorization) are gone. Actually at the end, the browser doesn't seem to have any cookies at all. The lack of those cookies could also be causing the 404 error response.

Is there any way that I can modify the server.js (or maybe something else), to NOT drop the cookies? For example I noticed this snippet in the server.js:

var cors_proxy = require('./lib/cors-anywhere');
cors_proxy.createServer({
  originBlacklist: originBlacklist,
  originWhitelist: originWhitelist,
  requireHeader: ['origin', 'x-requested-with'],
  checkRateLimit: checkRateLimit,
  removeHeaders: [
    'cookie',
    'cookie2',
    // Strip Heroku-specific headers
    'x-request-start',
    'x-request-id',
    'via',
    'connect-time',
    'total-route-time',
    // Other Heroku added debug headers
    // 'x-forwarded-for',
    // 'x-forwarded-proto',
    // 'x-forwarded-port',
  ],

If I removed the:

    'cookie',
    'cookie2',

Would that allow the cookies to not be dropped?

Please advise.

Thanks, Jim

EDIT: After much more testing, last night and today, I am starting to feel like the redirects that are supposed to happen when the request goes to the protected resource, are not even happening :(... The reason that I am starting to think this is:

  1. I have tried using several sniffers (wireshark, tcpdump), the browser web developer tool, and also Fiddler, and NONE of them are showing any requests after the request to the protected resource, and there is nothing showing any redirects
  2. I have my test protected URL configured for certificate authentication, so as part of the normal processing after hitting the protected resource, the OAM webgate would cause the browser to redirect to another URL to collect credentials, and a cert popup window would appear to allow selecting which client cert to use for the authentication. However during testing with the protected resource, I am not seeing any cert popup.

Do you have any idea why the redirects might not be occurring?

EDIT 2:

For comparison, here's a screenshot of the web developer=>Network for a test request where I pointed the browser directly to a protected resource (the cgi-bin/printenv on an Apache):

image

As you can see, there are 4 302/redirects (due to the webgate), followed by the final 200/OK.

Then, I used the same URL, but put it into the demo web text box and here is what the web developer=>Network looks like:

image

This time, there is only one request showing, with a 200/OK response... From the text in the left pane, the response page was an error page when the authentication failed.

I read the help page, which says that it should be able to follow 5 redirects:

Cookies are disabled and stripped from requests.

Redirects are automatically followed. For debugging purposes, each followed redirect results
in the addition of a X-CORS-Redirect-n header, where n starts at 1. These headers are not
accessible by the XMLHttpRequest API.
After 5 redirects, redirects are not followed any more. The redirect response is sent back
to the browser, which can choose to follow the redirect (handled automatically by the browser).

The requested URL is available in the X-Request-URL response header.
The final URL, after following all redirects, is available in the X-Final-URL response header.

So I am puzzled why the redirects do not seem to be happening? What could cause the redirects not to be followed?
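To check from page script what the proxy reports, the two headers named in the help text can be compared. This is a minimal sketch: `summarizeProxyRedirects` is a hypothetical helper, and it assumes both headers are exposed to the page. If the two values match, the proxy most likely followed no redirect for that request.

```javascript
// Summarize what the proxy reports about one proxied request.
// `headers` is anything with a case-insensitive get(name) that returns null
// for a missing header, such as the Headers object of a fetch() response.
function summarizeProxyRedirects(headers) {
  const requestedUrl = headers.get('x-request-url');
  const finalUrl = headers.get('x-final-url');
  return {
    requestedUrl: requestedUrl,
    finalUrl: finalUrl,
    // Only claim a redirect when both headers are present and differ.
    redirected: requestedUrl !== null && finalUrl !== null && requestedUrl !== finalUrl,
  };
}

// Assumed usage in a browser page:
// const response = await fetch(proxiedUrl);
// console.log(summarizeProxyRedirects(response.headers));
```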

EDIT 3: I was re-reviewing the test that I did where I provided the screen shots above and for the one where there were 4 302/redirects, I wanted to mention that the initial request was http, but 2 of the redirects were to https (and one of the 2 is actually looking for a 2-way SSL handshake to get the user's client cert). Then I found this older issue/post:

https://github.com/Rob--W/cors-anywhere/issues/27#issuecomment-108632963

and I was wondering if you think that any of the 5 suggestions you made might help me?

In particular I am thinking of this one:

Self-host CORS Anywhere, disable the xfwd option (see server.js) and add X-Forwarded-Proto to the removeHeaders list.
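A minimal sketch of that suggestion, written as a function that adjusts an options object before it is passed to `createServer`. `removeHeaders` and `httpProxyOptions.xfwd` are the names used in `server.js`; the function name is hypothetical, and whether this changes the webgate's redirect behaviour is untested here.

```javascript
// Return a copy of the proxy options with xfwd disabled and
// X-Forwarded-Proto added to the headers stripped from proxied requests.
function withoutForwardedHeaders(options) {
  const removeHeaders = new Set(
    (options.removeHeaders || []).map(function (name) { return name.toLowerCase(); })
  );
  removeHeaders.add('x-forwarded-proto');
  return Object.assign({}, options, {
    removeHeaders: Array.from(removeHeaders),
    httpProxyOptions: Object.assign({}, options.httpProxyOptions, { xfwd: false }),
  });
}

// Assumed usage:
// cors_proxy.createServer(withoutForwardedHeaders({ removeHeaders: ['cookie', 'cookie2'] }));
```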

ohaya commented 3 years ago

FYI, after re-examining some pcap files that I captured earlier, I am seeing "hints" that the redirects are actually occurring. I don't see (yet) the actual redirected requests themselves, but I am seeing the "X-CORS-Redirect-1" etc. response headers in one of the responses and also the "X-final-url" header.

I am guessing that the reason that I don't see the actual requests corresponding to those URLs is that I haven't configured Wireshark to decrypt the SSL yet, which I am attempting to do now.

I gather that the "x-final-url" means that is the final redirect in the chain of redirects? If so, the URL in that "x-final-url" header should not be the last URL in the chain of redirects (there should be more non-SSL redirects after the 2 SSL redirects that I see now).

Jim

EDIT: FYI, I have configured Wireshark for SSL decryption, and unfortunately the actual missing request/responses are still not appearing in Wireshark.

ohaya commented 3 years ago

Hi Rob,

FYI, I wanted to update my situation:

Thanks, Jim

Rob--W commented 3 years ago

If I removed the:

    'cookie',
    'cookie2',

Would that allow the cookies to not be dropped?

The request cookie would not be dropped, but cookies are still stripped in the library. This is hard-coded at https://github.com/Rob--W/cors-anywhere/blob/70aaa22b3f9ad30c8566024bf25484fd1ed9bda9/lib/cors-anywhere.js#L213-L215. It is not secure at all to remove this, because it can result in leakage of credentials between proxied websites. I strongly advise against it, as I mentioned at https://github.com/Rob--W/cors-anywhere/pull/154#issuecomment-468649353
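The leakage concern can be seen from how the browser views proxied URLs: every target is requested from the proxy's own origin, so a cookie stored for one proxied site would be scoped to the proxy host and sent along with requests for every other proxied site. A small sketch with placeholder hostnames (all three `.example` names are assumptions):

```javascript
// Return the origin the browser associates with a proxied ("double") URL.
function browserOriginOf(proxyBase, targetUrl) {
  return new URL(proxyBase.replace(/\/+$/, '') + '/' + targetUrl).origin;
}

const originA = browserOriginOf('http://proxy.example:8080', 'http://site-a.example/login');
const originB = browserOriginOf('http://proxy.example:8080', 'http://site-b.example/page');

// Both targets share the proxy's origin as far as the browser is concerned.
console.log(originA, originB, originA === originB);
```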

ohaya commented 3 years ago

Hi Rob,

Here's an update. I determined that the reason I wasn't able to see most of the request/response pairs before was because our dev environment is on AWS, and promiscuous monitoring doesn't work on AWS, so I have now put together a test environment that is running under VirtualBox.

I think I now have a scenario that is close to the one we were having earlier, and I have been able to take packet captures.

Even to get to this point, I had to add some Header directives in my Apache, because requests were coming in with "Origin" request headers, but the responses did not have the CORS response headers.

It also looks like there are two places where there are requests with "Origin" headers with values, where the response is a 401.

The requests that correspond to those 2 401 responses both have an "Origin" header, but one of the 401 responses has an "Access-Control-Allow-Origin" response header, and the other 401 response does not have an "Access-Control-Allow-Origin" response header.

I think that the missing ACAO response header is causing that 401 response to be blocked, and that is causing the authentication to fail (this scenario is using BASIC authentication).

Thus far, I cannot fix those last 2 using the Header directives, because those URLs are going directly to the WebLogic/OAM server.

Would it be all right to send you the PCAP file?

Thanks, Jim

EDIT: To be clear, because the 2 401 responses are being blocked, the rest of the protocol doesn't even happen, so there are more request/response pairs that I still have not seen yet.

ohaya commented 3 years ago

Hi,

I was able to find a different (what Oracle calls) "authentication scheme", which doesn't need redirects, so I changed the protection on the target URL in OAM to use that authentication scheme. This authentication scheme is using HTTP BASIC authentication (where you get a popup window to enter username and password).

When I tested going directly (using a browser) to that protected resource, sure enough there are no redirects. I get the BASIC popup, enter my username and password, and then the browser receives the protected page.

So I changed my test so that my Javascript/XHR does a GET on that protected URL with the CORS Anywhere URL (http://xxx:8080/) pre-pended to the protected URL.

However when I test that, I don't get the Basic popup.

Looking at the wireshark capture, I see the 401 response that has the "www-authenticate: Basic realm=xxxx" response header, which is supposed to be what causes the browser to present the popup window, so I've been looking at the 401 response when using the javascript/xhr and CORS Anywhere vs. going directly to the protected URL using a browser.

Right now, the only thing that I see is:

1) 401 Response when request to protected resource is using Javascript/XHR:

Frame 355: 934 bytes on wire (7472 bits), 934 bytes captured (7472 bits) on interface \Device\NPF_{A65DD5E0-F324-4BF0-8115-255A8EC064BD}, id 0
Ethernet II, Src: PcsCompu_4e:e2:d7 (08:00:27:4e:e2:d7), Dst: PcsCompu_4d:6c:d9 (08:00:27:4d:6c:d9)
Internet Protocol Version 4, Src: 192.168.0.106, Dst: 192.168.0.103
Transmission Control Protocol, Src Port: 7777, Dst Port: 56016, Seq: 1, Ack: 413, Len: 868
Hypertext Transfer Protocol
    HTTP/1.1 401 Unauthorized\r\n
    Date: Sun, 03 Oct 2021 13:04:01 GMT\r\n
    Server: Apache/2.4.29 (Unix) OpenSSL/1.0.2k-fips\r\n
    Access-Control-Allow-Origin: http://centos-apache1.whatever.com:7777\r\n
    Access-Control-Allow-Credentials: true\r\n
    Access-Control-Allow-Methods: GET, POST, OPTIONS\r\n
    Access-Control-Allow-Headers: Origin, Content-Type, Accept\r\n
    WWW-Authenticate: Basic realm="ATNSCHEME-BasicSessionlessScheme"\r\n
    Content-Length: 381\r\n
    Connection: close\r\n
    Content-Type: text/html; charset=iso-8859-1\r\n
    \r\n
    [HTTP response 1/1]
    [Time since request: 10.115751000 seconds]
    [Request in frame: 188]
    [Request URI: http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html]
    File Data: 381 bytes
Line-based text data: text/html (12 lines)

2) 401 Response when request to protected resource is directly from browser to protected resource:

Frame 261: 889 bytes on wire (7112 bits), 889 bytes captured (7112 bits) on interface \Device\NPF_{A65DD5E0-F324-4BF0-8115-255A8EC064BD}, id 0
Ethernet II, Src: PcsCompu_4e:e2:d7 (08:00:27:4e:e2:d7), Dst: PcsCompu_a8:ad:d1 (08:00:27:a8:ad:d1)
Internet Protocol Version 4, Src: 192.168.0.106, Dst: 192.168.0.10
Transmission Control Protocol, Src Port: 7777, Dst Port: 55025, Seq: 1, Ack: 480, Len: 835
Hypertext Transfer Protocol
    HTTP/1.1 401 Unauthorized\r\n
    Date: Sun, 03 Oct 2021 13:06:29 GMT\r\n
    Server: Apache/2.4.29 (Unix) OpenSSL/1.0.2k-fips\r\n
    Access-Control-Allow-Credentials: true\r\n
    Access-Control-Allow-Methods: GET, POST, OPTIONS\r\n
    Access-Control-Allow-Headers: Origin, Content-Type, Accept\r\n
    WWW-Authenticate: Basic realm="ATNSCHEME-BasicSessionlessScheme"\r\n
    Content-Length: 381\r\n
    Keep-Alive: timeout=5, max=100\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html; charset=iso-8859-1\r\n
    \r\n
    [HTTP response 1/1]
    [Time since request: 10.104103000 seconds]
    [Request in frame: 111]
    [Request URI: http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html]
    File Data: 381 bytes
Line-based text data: text/html (12 lines)

In the above, for the case where the request is from Javascript+XHR going through CORS Anywhere, to the protected resource, the 401 response has:

Connection: close

but when using a browser to go to the protected resource, the 401 response has:

    Keep-Alive: timeout=5, max=100\r\n
    Connection: Keep-Alive\r\n

I've been trying to configure the Apache that is hosting the protected URL (an Apache server). I can get the Apache to inject the "Keep-Alive: timeout=5, max=100" response header using the Apache "Header" directive, but it seems like there is no way to replace the "Connection: close" with "Connection: Keep-Alive" (I can ADD to the Connection header, but I cannot remove the "close").

The reason that I am posting this is that I cannot determine for sure where the "Connection" response header is coming from.

I don't think it is from the Apache that is hosting the target page, because that doesn't change between the 2 different cases.

So I am wondering if it is possible that that "Connection: close" response header is being set in the response by CORS Anywhere?
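For context, `Connection` and `Keep-Alive` are hop-by-hop headers in HTTP/1.1: they describe a single connection rather than the resource, so they can legitimately differ between the proxy-to-Apache connection and a direct browser-to-Apache connection. A minimal sketch for comparing two captured responses while ignoring such headers follows; the function name is hypothetical and the header objects are assumed to be copied by hand from a capture.

```javascript
// Headers that apply to one connection only (HTTP/1.1 hop-by-hop headers).
const HOP_BY_HOP = new Set([
  'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
  'te', 'trailer', 'transfer-encoding', 'upgrade',
]);

// Lower-case the header names of a plain { name: value } object.
function lowerCaseNames(headers) {
  const out = {};
  for (const name of Object.keys(headers)) {
    out[name.toLowerCase()] = headers[name];
  }
  return out;
}

// List the end-to-end header names whose values differ between two responses.
function endToEndDifferences(first, second) {
  const a = lowerCaseNames(first);
  const b = lowerCaseNames(second);
  const names = new Set(Object.keys(a).concat(Object.keys(b)));
  return Array.from(names)
    .filter(function (name) { return !HOP_BY_HOP.has(name) && a[name] !== b[name]; })
    .sort();
}
```

Applied to the two 401 responses quoted above, the end-to-end headers that differ are `Date` and `Access-Control-Allow-Origin`.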

Thanks, Jim

ohaya commented 3 years ago

Hi Rob,

I just found this thread in SO:

https://stackoverflow.com/questions/18499465/cors-and-http-basic-auth

and specifically the response from "Brock Allen" on Aug 29, 2013:

"If you're requesting credentials then the server must respond with the specific origin in the Access-Control-Allow-Origin response header (and thus can't use the wildcard *). Of course it would then also need to respond with Access-Control-Allow-Credentials response header too."

And then I checked the 401 response that is going back to the browser in my Wireshark captures, and that 401 response does have:

access-control-allow-origin: *

So perhaps that (because of the *) may be preventing the browser from popping up the login window?

I am not 100% sure where that response header is coming from, but I'm guessing that it may be from CORS Anywhere?

If so, could CORS Anywhere be able to send back a header that doesn't have "*", but rather the value from the original "Origin" request header?
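The rule quoted from that answer can be written as a small check. This is only a model of the header rule for credentialed cross-origin requests, with a hypothetical function name; it says nothing about whether a browser shows a Basic login prompt for an XHR.

```javascript
// Return true when a response's CORS headers permit a credentialed
// cross-origin request from requestOrigin.
// responseHeaders is a plain object with lower-case header names.
function credentialedCorsAllowed(requestOrigin, responseHeaders) {
  const allowOrigin = responseHeaders['access-control-allow-origin'];
  const allowCredentials = responseHeaders['access-control-allow-credentials'];
  return (
    allowOrigin !== '*' &&          // the wildcard is not accepted with credentials
    allowOrigin === requestOrigin && // the exact origin must be echoed back
    allowCredentials === 'true'
  );
}
```

With the values from the capture in this comment (`*` plus `access-control-allow-credentials: true`), the check returns false.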

FYI, here is the request captured:

Frame 165: 511 bytes on wire (4088 bits), 511 bytes captured (4088 bits) on interface \Device\NPF_{A65DD5E0-F324-4BF0-8115-255A8EC064BD}, id 0
Ethernet II, Src: PcsCompu_a8:ad:d1 (08:00:27:a8:ad:d1), Dst: PcsCompu_4d:6c:d9 (08:00:27:4d:6c:d9)
Internet Protocol Version 4, Src: 192.168.0.10, Dst: 192.168.0.103
Transmission Control Protocol, Src Port: 52934, Dst Port: 8080, Seq: 1, Ack: 1, Len: 457
Hypertext Transfer Protocol
    GET /http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html HTTP/1.1\r\n
    Host: centos-apache1.whatever.com:8080\r\n
    Connection: keep-alive\r\n
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36\r\n
    Accept: */*\r\n
    Origin: http://centos-apache1.whatever.com:7777\r\n
    Referer: http://centos-apache1.whatever.com:7777/\r\n
    Accept-Encoding: gzip, deflate\r\n
    Accept-Language: en-US,en;q=0.9\r\n
    \r\n
    [Full request URI: http://centos-apache1.whatever.com:8080/http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html]
    [HTTP request 1/1]
    [Response in frame: 181]

and here's the 401 response (to the BROWSER):

Frame 181: 1322 bytes on wire (10576 bits), 1322 bytes captured (10576 bits) on interface \Device\NPF_{A65DD5E0-F324-4BF0-8115-255A8EC064BD}, id 0
Ethernet II, Src: PcsCompu_4d:6c:d9 (08:00:27:4d:6c:d9), Dst: PcsCompu_a8:ad:d1 (08:00:27:a8:ad:d1)
Internet Protocol Version 4, Src: 192.168.0.103, Dst: 192.168.0.10
Transmission Control Protocol, Src Port: 8080, Dst Port: 52934, Seq: 1, Ack: 458, Len: 1268
Hypertext Transfer Protocol
    HTTP/1.1 401 Unauthorized\r\n
    x-request-url: http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html\r\n
    date: Mon, 04 Oct 2021 13:16:56 GMT\r\n
    server: Apache/2.4.29 (Unix) OpenSSL/1.0.2k-fips\r\n
    access-control-allow-origin: *\r\n
    access-control-allow-credentials: true\r\n
    access-control-allow-methods: GET, POST, OPTIONS\r\n
    access-control-allow-headers: Origin, Content-Type, Accept\r\n
    keep-alive: timeout=7, max=100\r\n
    www-authenticate: Basic realm="ATNSCHEME-BasicSessionless"\r\n
    content-length: 381\r\n
    connection: close\r\n
    content-type: text/html; charset=iso-8859-1\r\n
    x-final-url: http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html\r\n
     [truncated]access-control-expose-headers: date,server,access-control-allow-origin,access-control-allow-credentials,access-control-allow-methods,access-control-allow-headers,keep-alive,www-authenticate,content-length,connection,content-ty
    \r\n
    [HTTP response 1/1]
    [Time since request: 0.008985000 seconds]
    [Request in frame: 165]
    [Request URI: http://centos-apache1.whatever.com:8080/http://centos-apache3.whatever.com:7777/oamprotectedtarget/index.html]
    File Data: 381 bytes
Line-based text data: text/html (12 lines)

So if that access-control-allow-origin header is from CORS Anywhere, could somehow CORS Anywhere be able to send back:

access-control-allow-origin: http://centos-apache1.whatever.com:7777\r\n

instead of:

access-control-allow-origin: *\r\n

??

Thanks, Jim

EDIT: It looks like the access-control-allow-origin header is being set to "*" here in the code:

function withCORS(headers, request) {
  headers['access-control-allow-origin'] = '*';
  var corsMaxAge = request.corsAnywhereRequestState.corsMaxAge;
  if (request.method === 'OPTIONS' && corsMaxAge) {
    headers['access-control-max-age'] = corsMaxAge;
  }
  if (request.headers['access-control-request-method']) {
    headers['access-control-allow-methods'] = request.headers['access-control-request-method'];
    delete request.headers['access-control-request-method'];
  }
  if (request.headers['access-control-request-headers']) {
    headers['access-control-allow-headers'] = request.headers['access-control-request-headers'];
    delete request.headers['access-control-request-headers'];
  }

  headers['access-control-expose-headers'] = Object.keys(headers).join(',');

  return headers;
}
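For discussion only, here is a sketch of what echoing the request's origin could look like as a standalone header transformation. `reflectOrigin` and the allow-list parameter are hypothetical and not part of CORS Anywhere, and this does not re-enable cookies, which the maintainer advises against on a proxy shared between websites.

```javascript
// Return a copy of the response headers in which access-control-allow-origin
// echoes the request's origin, but only for origins on an explicit allow-list.
function reflectOrigin(headers, requestOrigin, allowedOrigins) {
  const result = Object.assign({}, headers);
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    result['access-control-allow-origin'] = requestOrigin;
    // The response now depends on the Origin request header, so caches must
    // be told to key on it.
    result['vary'] = result['vary'] ? result['vary'] + ', Origin' : 'Origin';
  } else {
    // Unknown origins get no CORS permission at all.
    delete result['access-control-allow-origin'];
  }
  return result;
}
```

Limiting the echo to an allow-list avoids granting every website that can reach the proxy the same access as the intended page.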