Closed lilo-san closed 4 years ago
Thanks for your error report. To analyze the problem more deeply, I need some help.
It looks like a problem with the preflight requests, but I need more details.
Hi @rbri,
I'm indeed using version 2.43.0. There is no preflight request; maybe that is the problem?
This server always responds with the header: Access-Control-Allow-Origin = *
Why? It is purely an API server that anybody is initially allowed to call, not necessarily from a browser; it has its own authentication/authorisation on top of the JSONRPC2 protocol.
This works perfectly in browsers. But then, a lot of stuff works in browsers that shouldn't.
Is a missing preflight request against the specification?
Here is the Charles Proxy Export for the Safari Browser: https://www.dropbox.com/s/6kwzm4wv3ylp4fp/Working-Safari.chls?dl=0
Sadly, I have been unable to record HtmlUnit so far, even though it is the same server and I'm reaching it using the same URL.
Is there any particular trick/guide for making HtmlUnit go through Charles? I will continue looking at it.
Define Charles as the proxy when setting up your web client:
try (final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED, "localhost", 8888)) {
Do you have a chance to debug into the HtmlUnit code?
Does your code set the withCredentials property of the request for the second call?
I have just tried with:
try (final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED, "localhost", 8888)) {
To my amusement, using the proxy I can not only record it, but it also works, which makes this futile if what we want is the stack trace.
https://www.dropbox.com/s/ahm1fqddjo0dc48/Working-HTMLUnit.chls?dl=0
My code doesn't change anything on the Java side: no withCredentials, nothing. It is just as plain as shown above.
I have not tried to debug HtmlUnit yet. Would you like to get the state of any particular class?
Can you please check whether it is related to the browser version or to Charles?
Another reason might be that the server adds some whitespace to the star in the header (cleaned up by Charles).
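To make the whitespace hypothesis concrete: an exact comparison against `*` rejects a padded header value, while a comparison on the trimmed value accepts it. This is only a minimal sketch; the class and method names (`AllowOriginCheck`, `strictMatch`, `lenientMatch`) are hypothetical and this is not HtmlUnit's actual code.

```java
/** Hypothetical helper, for illustration only; not part of HtmlUnit. */
final class AllowOriginCheck {
    private static final String ALLOW_ALL = "*";

    /** Exact comparison: any surrounding whitespace makes the check fail. */
    static boolean strictMatch(String headerValue, String origin) {
        return ALLOW_ALL.equals(headerValue)
                || (origin != null && origin.equals(headerValue));
    }

    /** Trims the header value first, so " * " is treated like "*". */
    static boolean lenientMatch(String headerValue, String origin) {
        if (headerValue == null) {
            return false; // no header at all: nothing to match
        }
        return strictMatch(headerValue.trim(), origin);
    }
}
```

If a proxy normalises the header on the way through, the strict variant would pass behind the proxy and fail without it, which is the behaviour this guess tries to explain.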
Debugging the HTMLUnit code, it seems this is the code throwing the exception:
if (!allowOriginResponse) {
    if (LOG.isDebugEnabled()) {
        LOG.debug("No permitted \"Access-Control-Allow-Origin\" header for URL " + this.webRequest_.getUrl());
    }
    throw new IOException("No permitted \"Access-Control-Allow-Origin\" header.");
}
Which doesn't make sense if we look at the Charles output, because the header is there.
I also noticed that when debugging step by step, for some reason I can't explain yet, it works.
The server doesn't add any whitespace; I just checked.
Are synchronous XMLHttpRequests well supported?
Please try with a breakpoint at the throw statement; maybe this gives a hint.
Do you reuse the same XMLHttpRequest instance?
I don't reuse it in JavaScript. This is everything; a new call to sendHttpPost is made every time.
var sendHttpPost = function(url, payload) {
    var request = new XMLHttpRequest();
    var method = 'POST';
    var shouldBeAsync = false;
    request.open(method, url['@value'], shouldBeAsync);
    request.send(payload);
    if (request.status === 200) {
        return request.responseText;
    } else {
        throw new Error('failed call to server - status: ' + request.status);
    }
};
Since stepping through works, but with a breakpoint on the throw I can see the state when it is not working, I'm looking for race conditions or some oversight in the handling of synchronous requests.
I'll cross my fingers and hope you can find the reason.
Looking at the state of HTMLUnit XMLHttpRequest class: https://www.dropbox.com/s/r34zhbvyc6xr0bg/call-never-made.png?dl=0
You can see in the watch that webResponse.getResponseHeaderValue("Access-Control-Allow-Origin") is null.
Which makes sense, because HtmlUnit never called the server! I literally put a log line where the packets enter the HTTP server: no call.
The normal behaviour in 2 is that you get the header back.
Does this ring any bells?
Here you have the state of request and response objects when the error happens.
https://www.dropbox.com/s/c188z2ffjib4r5z/call-never-made-2.png?dl=0
Obviously the response is empty.
Without some suggestions I don't think I can find the problem easily.
Will have a look at this tomorrow
- allowOriginResponse is set to true by default on line 754.
- webResponse.getResponseHeaderValue(HttpHeader.ACCESS_CONTROL_ALLOW_ORIGIN) on line 756 returns null.
Thanks for your support today, hopefully we can find out what is going on.
One last note:
Regarding line 747 in the XMLHttpRequest class, which returns null instead of the actual response:
747 final WebResponse webResponse = wc.loadWebResponse(webRequest_);
If I make the same call using a watch at the breakpoint, it returns the actual response without issues. It also works if I simulate a 10-second wait in my JavaScript code between requests.
So why can't loadWebResponse handle quick consecutive calls?
Good point; I have no idea at the moment. Do you use the latest HttpClient?
It would be great if you could dig a bit deeper into the mud of com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebRequest, int). My wild guess yesterday was the caching, but you are using POST requests, and if you wait a bit it works. Strange... I have not seen something like that before.
And regarding Charles: do you think the impact of Charles on the timing is the point here?
I will dig more later; I prefer not to speculate.
If I can at least get the original error for the failed connection from the HtmlUnit HTTP client, we can learn more.
My Java client that uses plain java.net.HttpURLConnection works successfully.
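For comparison, a plain java.net.HttpURLConnection client of the kind mentioned here might look roughly like the sketch below. It mirrors the JavaScript sendHttpPost from earlier in the thread. The class name `PlainPostClient`, the `/rpc` path, and the throwaway echo server (the JDK's built-in com.sun.net.httpserver) are assumptions made only so the example can run on its own; they are not taken from the reporter's project.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class PlainPostClient {

    /** Sends a blocking POST and returns the response body. */
    static String post(String url, String payload) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload.getBytes(StandardCharsets.UTF_8));
            }
            int status = conn.getResponseCode();
            if (status != 200) {
                throw new IOException("failed call to server - status: " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Starts an echo server on a free port, posts twice back to back, returns the second reply. */
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
            server.createContext("/rpc", exchange -> {
                byte[] body = exchange.getRequestBody().readAllBytes();
                exchange.getResponseHeaders().add("Access-Control-Allow-Origin", "*");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                String url = "http://localhost:" + server.getAddress().getPort() + "/rpc";
                post(url, "first");
                return post(url, "second");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `PlainPostClient.demo()` sends two quick consecutive POSTs, which is the pattern that failed under HtmlUnit in this report.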
Regarding Charles: maybe, but not necessarily. I needed to add more waiting time in my JavaScript code than the latency added by Charles, so maybe something else is in play.
What I have found so far is an error from the Apache HTTP client: org.apache.http.NoHttpResponseException. Stack trace below.
A possible explanation could be related to: https://stackoverflow.com/questions/10558791/apache-httpclient-interim-error-nohttpresponseexception
And a possible solution is here: https://stackoverflow.com/questions/29006222/how-to-solve-org-apache-http-nohttpresponseexception
Is there any way to tell the Apache HTTP client not to reuse connections in HtmlUnit?
---- Stacktrace from org.apache.http.NoHttpResponseException
org.apache.http.NoHttpResponseException: localhost:8080 failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:193)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1537)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1456)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.doSend(XMLHttpRequest.java:747)
    at com.gargoylesoftware.htmlunit.javascript.host.xml.XMLHttpRequest.send(XMLHttpRequest.java:610)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:188)
    at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:457)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1697)
    at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:1013)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:111)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:427)
    at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:340)
    at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3640)
    at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:123)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$2.doRun(JavaScriptEngine.java:800)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:914)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:619)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:537)
    at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.callSecured(HtmlUnitContextFactory.java:354)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:809)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:785)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:776)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScript(HtmlPage.java:943)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:305)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:395)
    at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:234)
    at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:256)
    at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:560)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:514)
    at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1192)
    at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1132)
    at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:219)
    at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:312)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3189)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2114)
    at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:937)
    at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:443)
    at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:394)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.parse(HtmlUnitNekoDOMBuilder.java:760)
    at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoHtmlParser.parse(HtmlUnitNekoHtmlParser.java:208)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:283)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:163)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:638)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:451)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:368)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:520)
    at com.translator.javascript.APITest.javascriptAPITest(APITest.java:39)
Long story short:
In HTTP/1.1, if nothing is specified, connections are persistent by default, as if the header Connection: keep-alive had been sent.
The server was closing the connection after returning the response, but without adding the header Connection: close.
It seems the Apache HTTP client doesn't verify that the stream has been closed before reusing the connection.
Adding the header on the Server solves the issue.
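The failure mode summarised above can be reproduced with nothing but raw sockets. In the sketch below, a toy server always closes the connection after one response and only optionally announces it with Connection: close; a toy client reuses the socket unless it sees that header. Everything here (`StaleConnectionDemo`, `secondRequestSucceeds`, the 2-byte "ok" body) is hypothetical illustration code, not HtmlUnit or Apache HttpClient internals.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class StaleConnectionDemo {

    /** Reads up to the blank line that ends an HTTP head; "" means the peer sent nothing. */
    static String readHead(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
            if (sb.toString().endsWith("\r\n\r\n")) {
                break;
            }
        }
        return sb.toString();
    }

    /** Toy server: answers one request per connection, then always closes the socket. */
    static void serve(ServerSocket server, boolean announceClose) {
        String head = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n"
                + (announceClose ? "Connection: close\r\n" : "") + "\r\n";
        while (!server.isClosed()) {
            try (Socket s = server.accept()) {
                readHead(s.getInputStream());
                s.getOutputStream().write((head + "ok").getBytes(StandardCharsets.US_ASCII));
            } catch (IOException e) {
                // server socket was closed, or one client connection failed; loop re-checks
            }
        }
    }

    /** Sends one GET on the socket; returns the response head, or null if no response arrived. */
    static String request(Socket s) {
        try {
            s.getOutputStream().write(
                    "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
            String head = readHead(s.getInputStream());
            if (head.isEmpty()) {
                return null; // end of stream before any byte: "failed to respond"
            }
            s.getInputStream().readNBytes(2); // consume the 2-byte body
            return head;
        } catch (IOException e) {
            return null; // connection reset also means no response
        }
    }

    /** Two back-to-back requests; the client reuses the socket unless told "Connection: close". */
    static boolean secondRequestSucceeds(boolean announceClose) {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread t = new Thread(() -> serve(server, announceClose));
            t.setDaemon(true);
            t.start();
            Socket s = new Socket(server.getInetAddress(), server.getLocalPort());
            try {
                String first = request(s);
                if (first != null && first.contains("Connection: close")) {
                    s.close(); // the server announced the close, so open a fresh connection
                    s = new Socket(server.getInetAddress(), server.getLocalPort());
                }
                return request(s) != null;
            } finally {
                s.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With `announceClose` set to false, the second request goes out on a socket the server has already closed and gets no response; with it set to true, the client opens a new connection and the second request succeeds.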
@rbri Thanks for your support on this.
Wow, many thanks for all the analysis.
Should I close this, or do you think I have to improve HtmlUnit regarding this?
Some would say the Apache Client should close the connection if the stream is closed.
Some would say the Server should put the header in place.
In my opinion, it is not worth changing your HTTP client if this doesn't happen more often, because of the amount of work.
But most people will not spend 4-5 hours debugging a library they are just trying for the first time, or report the issue; they will just move on to the next one.
Now the things that can be easily improved, in my opinion:
The error that is thrown: the initial error thrown by the library is incorrect:
java.io.IOException: No permitted "Access-Control-Allow-Origin" header.
It was actually hiding the real issue, which may not be in HtmlUnit, misdirecting the debugging and giving a new user a bad impression of the tool.
The real exception was:
org.apache.http.NoHttpResponseException: localhost:8080 failed to respond
I'm not keen on wrapping or dismissing connectivity exceptions; surfacing them lets system administrators quickly notice when things go south due to some environmental issue, etc.
Users can then just look the error up, saving you support time.
If any of those points resonate with you, please take action; the issue itself I consider closed : )
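The point about the misleading message can be illustrated with ordinary exception chaining. The sketch below contrasts replacing a transport failure with an unrelated message against keeping it attached as the cause. The class `ErrorReporting`, its methods, and the wording of the chained message are hypothetical; this is not how HtmlUnit handles the error.

```java
import java.io.IOException;

/** Hypothetical illustration of masking versus chaining an underlying failure. */
final class ErrorReporting {

    /** Stand-in for a transport layer that fails the way the Apache client did. */
    static String load() throws IOException {
        throw new IOException("localhost:8080 failed to respond");
    }

    /** Masks the failure: the caller only sees a CORS message and no cause. */
    static IOException masked() {
        try {
            load();
            return null; // not reached in this sketch, load() always throws
        } catch (IOException e) {
            return new IOException("No permitted \"Access-Control-Allow-Origin\" header.");
        }
    }

    /** Keeps the original failure as the cause, so it appears in the stack trace. */
    static IOException chained() {
        try {
            load();
            return null; // not reached in this sketch, load() always throws
        } catch (IOException e) {
            return new IOException("Request failed before any response was received", e);
        }
    }
}
```

With the chained variant, a "Caused by:" section in the stack trace would point straight at the connectivity problem.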
Totally agree; I will try to build a test case for this and improve the error handling.
Reminder for myself: enhance NoHttpResponseTest.
Still working on that...
Made some progress here. If you like, you can try the latest snapshot. Maybe I will add some more logging.
Wrote some more tests for this; maybe I have to add some tests for anchor processing.
Hopefully I have fixed all cases now. Will close this.
I tested that my code works perfectly in Chrome, but it fails in HtmlUnit. All my code is pretty basic old-style JavaScript that I use to test a JSON-RPC2 API.
Steps to reproduce:
For the first request, the server returns the header "Access-Control-Allow-Origin = *" with the response. This is expected; it is an API usable from any domain.
Now send a second request from the client. For some reason the client throws an exception: No permitted "Access-Control-Allow-Origin" header. The request never reaches the server. And as shown in the code above, the request doesn't include that header, so why does the client complain as if it were there?
My wild guess is that the client is now trying to send that header because it received it from the server, but it is not a valid request header. I have total control of the client and server running in my unit tests. If there is anything I can try, let me know.
This is all my HtmlUnit code:
Stack-trace below: