Rob--W / cors-anywhere

CORS Anywhere is a NodeJS reverse proxy which adds CORS headers to the proxied request.
MIT License

Still problems with CORS-Anywhere (deployed to my server) #383

Closed ohaya closed 2 years ago

ohaya commented 2 years ago

Hi,

I have deployed CORS-Anywhere to one of my servers. The environment has a DNS server that I host myself.

The server has both Apache and node on it. The server's IP is 192.168.xxx.yy.

The CORS-Anywhere server.js is running under the node, listening on port 8080.

The Apache is listening on port 7777. I have an HTML page that I serve from the Apache at /xhrtest/xhr-fakewava.html (see below).

I am doing a request to: http://192.168.xxx.yy:8080/http://fakewava.whatever.com:7777/xhrtest/xhr-fakewava.html

Webpage after response: "Missing required request header. Must specify one of: origin,x-requested-with"

Console shows: GET http://192.168.xxx.yy:8080/http://fakewava.whatever.com:7777/xhrtest/xhr-fakewava.html 400 (Header required)

However, the xhr-fakewava.html looks like (i.e., I already have the xhr.setRequestHeader() in it):

<script>
// 1. Create a new XMLHttpRequest object
let xhr = new XMLHttpRequest();

// 2. Configure it: GET-request for the URL /article/.../load
//xhr.open('GET', 'http://fakewava.whatever.com:7777/wavatarget/index.html');
xhr.open('GET', 'http://192.168.xxx.yy:7777/wavatarget/index.html');
xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');

// 3. Send the request over the network
xhr.send();

// 4. This will be called after the response is received
xhr.onload = function() {
  if (xhr.status != 200) { // analyze HTTP status of the response
    alert(`Error ${xhr.status}: ${xhr.statusText}`); // e.g. 404: Not Found
  } else { // show the result
    alert(`Done, got ${xhr.response.length} bytes`); // response is the server response
  }
};

xhr.onprogress = function(event) {
  if (event.lengthComputable) {
    alert(`Received ${event.loaded} of ${event.total} bytes`);
  } else {
    alert(`Received ${event.loaded} bytes`); // no Content-Length
  }

};

xhr.onerror = function() {
  alert("Request failed");
};

</script>

Since I already have the setRequestHeader(), why am I still getting the "Missing required request header. Must specify one of: origin,x-requested-with" error?

I also have the demo.html hosted on the same Apache, at /cors-anywhere/demo.html, and I have changed the cors_api_url:

var cors_api_url = 'http://192.168.xxx.yy:8080/';

But if I go to the demo page, and enter the URL: http://fakewava.whatever.com:7777/xhrtest/xhr-fakewava.html

and then click GET, I am seeing a strange behavior. Instead of the Javascript in that xhr-fakewava.html file being executed to do the GET, the web page is showing the contents of xhr-fakewava.html, i.e., the demo.html seems to be loading the xhr-fakewava.html page, but the Javascript in that page is not executing.

Why might that be?

Thanks! I hope that I have provided enough information to help figure out how to fix these problems?

Jim

EDIT 1: BTW, one additional piece of information:

If I use the browser to get that xhr-fakewava.html page directly (i.e., without the CORS-Anywhere URL in front of the xhr-fakewava.html URL), the Javascript that is in the xhr-fakewava.html page DOES execute!

http://192.168.xxx.yy:7777/xhrtest/xhr-fakewava.html

Rob--W commented 2 years ago

Your code does clearly set the X-Requested-With header. If you're still getting the Missing required request header. Must specify one of: origin,x-requested-with response, then that means that the server did not receive the header.

A possible reason for this is if you configured your Apache to redirect to the proxy, instead of just passing through the request. With a redirect, custom client-side headers are dropped.

PS. Instead of opening multiple issues about the issues that you've encountered while deploying your instance, please post on the same issue. Even if the issue is closed, I can still see comments and respond.

ohaya commented 2 years ago

Hi,

Re. posting on same issue - sorry about that. I had seen you had closed an earlier issue after you had commented, so I assumed that was because you wanted to see a new issue for any followups.

Re. "if you configured your Apache to redirect to the proxy, instead of just passing through the request", I am not sure what you meant?

The xhr-fakewava.html file is on the Apache local filesystem under the htdocs directory, so the URL http://xxxxx:7777/xhrtest/xhr-fakewava.html should cause the Apache to just serve that HTML page from Apache machine's filesystem. So no redirect or rewrite there.

I am also unclear about what you meant when you said "instead of just passing through the request"... What(?) should be passing which(?) request to what (?)?

Also, I had asked about why the Javascript/XHR that is in the xhr-fakewava.html page is not being executed, but rather the content of the xhr-fakewava.html page is just being displayed.

What would cause that kind of behavior??

Jim

EDIT 1: Recall if I just point a browser directly to http://aaa.bbb.com:7777/xhrtest/xhr-fakewava.html that DOES execute the Javascript that is inside the xhr-fakewava.html page!!

Rob--W commented 2 years ago

Your code snippet contains xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');.

You claim that you're getting Missing required request header. Must specify one of: origin,x-requested-with, which means two things: 1) the response is from CORS Anywhere and 2) the header that you have set is not present.

From these two, it follows that the final request to CORS Anywhere is not the original request, which can happen if you redirected the initial request.

Have you tried to use the Network tab of your browser's developer tools to see what's going on?

ohaya commented 2 years ago

Yes, I use the network tab constantly, but when I test, I actually see nothing in the Network tab.

You haven't commented about the Javascript not executing. Is that because you think the message about the missing request header is somehow related to the Javascript not executing?

The strange part about the not executing problem is that if I just load the HTML page into the browser (no CORS Anywhere), the Javascript does execute, so I know or think I know that it's not a problem with the HTML page itself.

Also do you have any suggestion how to debug both of these problems? Maybe adding some console.log() might help? What should I be looking for?

ohaya commented 2 years ago

Here's a shot from a test, with the web developer =>network:

image

FYI, I did another of the same test, but with Chrome and live http headers. There were no redirects, at least as seen in live http headers. There was just the one request, and then a 2nd request for favicon.ico.

Here is the header trace for the 1st (the only) request:

GET /http://fakewava.whatever.com:7777/xhrtest/xhr-fakewava.html HTTP/1.1
Host: 192.168.xxx.yy:8080
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36

HTTP/1.1 400 Header required
access-control-allow-origin: *
access-control-expose-headers: access-control-allow-origin
Connection: keep-alive
Date: Thu, 23 Sep 2021 19:02:58 GMT
Keep-Alive: timeout=5
Transfer-Encoding: chunked

I notice that the response has an "access-control-allow-origin: *", but I don't understand why since the request didn't have an Origin.

Rob--W commented 2 years ago

Opening the URL in the browser is NOT a good way to verify whether it works.

CORS Anywhere intentionally requires the request to be made with CORS or XMLHttpRequest/fetch, because the alternative (loading as a document) is a security concern - that would allow the proxied response to execute scripts.
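For illustration, here is a minimal sketch of the kind of check that produces the "Missing required request header" response. It is not the project's actual source: the helper name `hasRequiredHeader` and the sample header objects are assumptions, modeled on the error message and on the header trace earlier in this thread.

```javascript
// Sketch only: mirrors the header names listed in the error message.
const REQUIRED_HEADERS = ['origin', 'x-requested-with'];

// Hypothetical helper: true if at least one required header is present.
// Names are lowercased first because HTTP header names are case-insensitive.
function hasRequiredHeader(headers) {
  const lowered = {};
  for (const [name, value] of Object.entries(headers)) {
    lowered[name.toLowerCase()] = value;
  }
  return REQUIRED_HEADERS.some((name) => Boolean(lowered[name]));
}

// A plain address-bar navigation, like the trace above: no Origin, no X-Requested-With.
const navigationRequest = {
  'Host': '192.168.xxx.yy:8080',
  'Accept': 'text/html,application/xhtml+xml',
  'Upgrade-Insecure-Requests': '1',
};

// What an XHR sends after xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest').
const xhrRequest = {
  'Host': '192.168.xxx.yy:8080',
  'X-Requested-With': 'XMLHttpRequest',
};

console.log(hasRequiredHeader(navigationRequest)); // false
console.log(hasRequiredHeader(xhrRequest)); // true
```

Under this sketch, typing the proxied URL into the address bar always fails the check, because the browser never runs the page's `setRequestHeader()` call before contacting the proxy.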

ohaya commented 2 years ago

Do you have any idea about what else I can try/change/test?

ohaya commented 2 years ago

Is it possible that the way I have this test set up, with both CORS Anywhere and the Apache serving the page on the same machine, and "faking" two different hosts by leveraging the DNS server, is causing some problem that we don't realize?

FYI, I am going to try to stand up a new Apache on another machine, and make that new machine the "fakewava.whatever.com" machine and see if that makes a difference...

Jim

ohaya commented 2 years ago

Hi Rob,

I think that you were right!!

I was just doing some testing with the demo.html (hosted here) and I had live headers on, and then I put in a URL like:

http://xxx.yyy.com:7777/

In live headers I am seeing a 301 response:

GET /http://xxx.yyy.com:7777/ HTTP/1.1
Host: 192.168.aaa.bb:8080
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36

HTTP/1.1 301 Please use a direct request
access-control-allow-origin: *
access-control-expose-headers: access-control-allow-origin
cache-control: private
Date: Thu, 23 Sep 2021 20:38:25 GMT
location: http://xxx.yyy.com:7777/
vary: origin

In the web developer==> network, I see 2 requests:

1) Request URL: http://192.168.aaa.bb:8080/http://xxx.yyy:7777/

and then:

2) http://xxx.yyy.com:7777/

I assume that the 301 response AND the location header will cause a redirect, but I am unclear about why that redirect is happening. As far as I am aware, there isn't anything special about this Apache that would cause a redirect.

I have seen that Apache does sometimes cause a redirect when the URL is missing a trailing "/" (so Apache causes a redirect which adds the "/" at the end).

Do you know or have a guess?

Also, assuming it is redirecting, would that cause the header to be gone/removed?

Jim

Rob--W commented 2 years ago

The basic API of the CORS Anywhere server is very simple: put the destination URL after the base URL of the proxy server.
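As a small sketch of that rule (the helper name `buildProxiedUrl` is hypothetical; the proxy base is the `cors_api_url` value quoted earlier in the thread):

```javascript
// Hypothetical helper: prefix a destination URL with the proxy's base URL.
function buildProxiedUrl(proxyBase, targetUrl) {
  // Ensure exactly one slash separates the proxy base from the destination URL.
  const base = proxyBase.endsWith('/') ? proxyBase : proxyBase + '/';
  return base + targetUrl;
}

const corsApiUrl = 'http://192.168.xxx.yy:8080/';
console.log(buildProxiedUrl(corsApiUrl, 'http://google.com'));
// http://192.168.xxx.yy:8080/http://google.com
```

The result is the URL that script code passes to `xhr.open()` or `fetch()`; it is not meant to be opened as a document in the address bar.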

Your experiment shows that you are concatenating many URLs. That doesn't make sense.

The 301 redirect happens because the first part of the URL is the same as the website itself, and that is deemed a coding error that would lead to an inefficient use of resources (without the 301 redirect an internal redirect would happen).
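One way to read this explanation, sketched under the assumption that the proxy compares the requesting page's Origin header with the origin of the destination URL (the helper name `isSameOriginRequest` is hypothetical and this is not the project's code):

```javascript
// Hypothetical helper: does the destination URL belong to the same origin
// as the page that issued the request (as reported by its Origin header)?
function isSameOriginRequest(originHeader, targetUrl) {
  if (!originHeader) return false; // no Origin header, nothing to compare
  return new URL(targetUrl).origin === originHeader;
}

// demo.html served from the same Apache that hosts the destination URL:
console.log(isSameOriginRequest('http://xxx.yyy.com:7777', 'http://xxx.yyy.com:7777/')); // true

// The same page requesting a different site through the proxy:
console.log(isSameOriginRequest('http://xxx.yyy.com:7777', 'http://google.com')); // false
```

In the first case the page could have fetched the resource directly without CORS, which would match the "Please use a direct request" status text in the trace above.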

ohaya commented 2 years ago

I am really confused by what you said, when you said "the first part of the URL is the same as the website itself"????

Can you specifically show me where I had that? I may have messed up pasting URLs in my msg, because it's kind of a pain to obfuscate them, but I don't think that I am doing what you said above.

BTW, I started doing that last set of tests because on one of the issues (not mine, but someone else's thread), you said to bring up the demo page and enter something like "http://1.1.1.1", so I tried that, plus several other examples from your main page like "http://google.com", etc. and I just tried that again using my demo page here, with, for example, http://google.com, and that works.

Since the demo page is working with google, etc., can we start from that (we can circle back to the / case later)?

So anyway, then I entered the following URL into the demo page text box:

http://charlieeastweb.....com:7777/xhrtest/xhr-fakewava.html

and click GET, and what I see in the response text box is:

GET http://charlieeastweb.....com:7777/xhrtest/xhr-fakewava.html
200 OK

<html>

<body>

<script>
// 1. Create a new XMLHttpRequest object
let xhr = new XMLHttpRequest();

// 2. Configure it: GET-request for the URL /article/.../load
//xhr.open('GET', 'http://fakewava.whatever.com:7777/wavatarget/index.html');
//xhr.open('GET', 'http://192.168.157.23:7777/wavatarget/index.html');
xhr.open('GET', 'http://charlieeastweb08.sbx.gxaws.com:7777/wavatarget/index.html');
xhr.setRequestHeader('X-Requested-With', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXMLHttpRequest');

alert(location.host);

// 3. Send the request over the network
xhr.send();
.
.
.

So in this test, I was able to retrieve the xhr-fakewava.html page, which contains Javascript and XHR, but it is just treating that as text and not executing the Javascript/XHR.

Why is that?

When I look at the web developer =>Network I see 2 GET requests:

1) Request URL: http://192.168.xxx.yyy:8080/http://charlieeastweb....com:7777/xhrtest/xhr-fakewava.html Status code: 301

2) Request URL: http://charlieeastweb....com:7777/xhrtest/xhr-fakewava.html Status code: 200 OK

Then, the demo is showing the text contents of the xhr-fakewava.html page.

I would've expected another request after #2 above, where the Javascript/XHR code inside the xhr-fakewava.html page was doing the XHR GET using the code I posted earlier, but there is no 3rd request in the web developer=>network display.

So using the demo page, it is at least not blowing up, but even though that is the case, the demo page is still not functioning fully correctly.

BTW, the URL I posted above:

http://charlieeastweb.....com:7777/xhrtest/xhr-fakewava.html

is literally (except for the obfuscated part I removed) the URL I put into the text box in the demo page. That is not a double URL... it is just the URL to get the HTML page that contains the Javascript.

NOTE that the 1st request above, has this URL:

http://192.168.xxx.yyy:8080/http://charlieeastweb....com:7777/xhrtest/xhr-fakewava.html

IS a double URL, but that double URL was output from the web developer ==>network display.

I DID NOT enter the entire double-URL... into the demo page. I only entered the last part into the demo page text box:

http://charlieeastweb....com:7777/xhrtest/xhr-fakewava.html

Now, I have a question. I keep wondering whether I am misunderstanding how this is supposed to work.

Isn't the URL that I put into the demo page text box supposed to point to an (HTML) page that contains Javascript/XHR code to retrieve some resource (from a different domain)?

That was my understanding, so can you confirm that is correct or not?

Sorry for the long post.

Jim

EDIT 1: BTW that 1st request in web developer=>network appears as "xhr / Redirect":

image

ohaya commented 2 years ago

I have another question that I wanted to try to confirm with you:

When I am getting the "Missing required request header. Must specify one of: origin,x-requested-with" message, I understand that the recommendation was to add the xhr.setRequestHeader() to the Javascript.

I have been ASSUMING that the Javascript that needs to be modified is the Javascript in the xhr-fakewava.html HTML page (which is the HTML page that contains Javascript to do the XHR to get a target HTML test page I have, the "wavatarget.html" page).

Is the "xhr-fakewava.html" page the correct page where the xhr.setRequestHeader() needs to be added?

Thanks, Jim

ohaya commented 2 years ago

Actually, I was just testing with the demo page on your/this website, e.g., putting "http://google.com", and I am unclear - is it actually executing Javascript on the page that it is retrieving? It seems like it is working the way it is working on my test environment, i.e., it is just retrieving the text of the page and not executing anything?

ohaya commented 2 years ago

Rob,

I THINK that I have just had a potential "AHA" moment. I've been reading all kinds of stuff trying to understand more about CORS Anywhere, and I found this SO thread:

https://stackoverflow.com/questions/29670703/how-to-use-cors-anywhere-to-reverse-proxy-and-add-cors-headers

specifically the post by "Graham Hannington" on Dec 14, 2016.

Reading (and thinking about) what Graham said, I think that I may've had a misconception about what CORS Anywhere does (or how it does it), and I think at least on my side, I've been "coming from" that misconception so I had a really hard time understanding what you've been saying.

Let me explain (and you can tell me if what I am thinking now is right or wrong).

Ok, originally I thought that CORS Anywhere is meant to let us retrieve the HTML/Javascript page that we have from a different domain, and that our HTML/Javascript page would then execute the Javascript and use XHR to retrieve (for example) a page from another different domain.

However, I think that what I was thinking is the wrong scenario.

From reading Graham's post (among others), I think that in the scenario above I was using CORS Anywhere and the double URL in the WRONG PLACE.

I think that your intention (and what CORS Anywhere helps do) is that it helps us in scenarios where we have an HTML/Javascript/XHR/Fetch page that needs to retrieve a resource, and the RESOURCE is in a different domain. In that case we would modify the original URL that is pointing to the resource, in the code, and make that into a double URL, and then after that is done, the Javascript/XHR/Fetch will be able to retrieve the resource from the other domain without dealing with the CORS request headers (e.g., Origin) and response header (e.g., the ACAO header).

Is that correct???

If it is, then, man, I have to apologize for being so confused :(!!!!

And if that is correct, then it explains why, with my test, I am just getting the text of my Javascript page, instead of that Javascript code executing!!

Please let me know, because if the above is correct, then I have a lot of thinking to do to try to see if we can leverage CORS Anywhere (I think the new concept does, but I don't want to evaluate how it fits until I know whether I got the right understanding or not).

Thanks, Jim

EDIT: P.S. If my new understanding IS correct, I/we still need to figure out why, when I use the double URLs, I am getting that message about requiring the header.
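For illustration, the revised understanding described above might look like this in the page's own script. This is a sketch under assumptions: the helper name `openViaProxy` is hypothetical, the proxy base and resource URL are the obfuscated values quoted earlier in the thread, and the stand-in object only records what a real XMLHttpRequest would be asked to do.

```javascript
// Assumed values taken from earlier in the thread (hosts remain obfuscated).
const corsApiUrl = 'http://192.168.xxx.yy:8080/';
const resourceUrl = 'http://fakewava.whatever.com:7777/wavatarget/index.html';

// Hypothetical helper: open a GET through the proxy on any XMLHttpRequest-like object.
function openViaProxy(xhr, proxyBase, targetUrl) {
  // The proxy prefix goes on the RESOURCE URL inside the script,
  // not on the URL of the HTML page that contains the script.
  xhr.open('GET', proxyBase + targetUrl);
  // Supplies one of the headers named in the proxy's error message.
  xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
  return xhr;
}

// In a browser: openViaProxy(new XMLHttpRequest(), corsApiUrl, resourceUrl).send();
// Outside a browser, a stand-in object records what would be requested.
const recorded = { headers: {} };
const fakeXhr = {
  open(method, url) { recorded.method = method; recorded.url = url; },
  setRequestHeader(name, value) { recorded.headers[name] = value; },
};
openViaProxy(fakeXhr, corsApiUrl, resourceUrl);
console.log(recorded.method, recorded.url);
// GET http://192.168.xxx.yy:8080/http://fakewava.whatever.com:7777/wavatarget/index.html
```

With this arrangement the HTML page itself is loaded directly from Apache, so its script runs as usual, and only the cross-domain resource request passes through the proxy.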