Open toniritter opened 3 years ago
After switching the dependency to version 2.51.0, the exception is no longer thrown, but I still end up on the "Unsupported" page https://flashscore.com/unsupported/
The browser detection is done by this script: https://www.flashscore.com/x/js/browsercompatibility_4.js
// !!! for update iterate manually `browser_compatibility_serial`
"use strict";
try {
    (function () {
        var cssRequirements = [["display", "flex"], ["display", "grid"], ["color", "red"]];
        for (var i in cssRequirements) {
            if (!CSS.supports(cssRequirements[i][0], cssRequirements[i][1])) {
                throw "no-" + cssRequirements[i][0] + "-" + cssRequirements[i][1];
            }
        }
        try {
            new XMLHttpRequest();
        }
        catch (pass) {
            throw "no-ajax";
        }
        try {
            eval("var foo = (x)=>x+1");
        }
        catch (pass) {
            throw "no-es6";
        }
        try {
            eval("var foo = {}; var bar = {...foo};")
        }
        catch (pass) {
            throw "no-spread";
        }
    })();
}
catch (e) {
    var utm = "";
    if (typeof e == "string" && /^[a-z0-9\-]+$/.test(e)) {
        utm = "?err=" + e;
    }
    window.location.replace("/unsupported/" + utm);
}
For the moment I can fix CSS.supports(), but because Rhino does not (yet) support the spread syntax (https://github.com/mozilla/rhino/issues/968), the check will still fail.
The only option you have is to 'patch' the script and replace or comment out some parts (see https://htmlunit.sourceforge.io/faq.html#HowToModifyRequestOrResponse). It is at least worth a try.
I have made a fix for CSS.supports() and will make a new snapshot available soon (check Twitter for updates).
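As a rough illustration of the 'patch' idea, the sketch below only covers the text-rewriting step: it removes the redirect statement from the compatibility script so that a failed feature check no longer sends the client to /unsupported/. The class name CompatibilityScriptPatch and the method disableRedirect are hypothetical, and the sketch assumes the statement appears exactly as in the script quoted above; if the site changes the script, the replacement silently does nothing.

```java
public class CompatibilityScriptPatch {

    // The redirect statement exactly as it appears in the script quoted above
    // (assumption: the served script still contains it verbatim).
    static final String REDIRECT = "window.location.replace(\"/unsupported/\" + utm);";

    // Hypothetical helper: returns the script with the redirect replaced by a
    // block comment, so the surrounding try/catch stays syntactically valid.
    static String disableRedirect(String script) {
        return script.replace(REDIRECT, "/* redirect to /unsupported/ removed */");
    }

    public static void main(String[] args) {
        String original = "try { throw \"no-spread\"; } catch (e) { var utm = \"\"; "
                + REDIRECT + " }";
        System.out.println(disableRedirect(original));
    }
}
```

Inside a WebConnectionWrapper, the patched text could be produced from response.getContentAsString() and wrapped in a new WebResponseData, as described in the FAQ entry linked above. This only bypasses the detection script; any other script that uses syntax Rhino cannot parse would still fail.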
I did as suggested and tried to modify the response, but now I get the following exception (still on version 2.51.0):
2021-07-12 19:23:13.844 ERROR 2820 --- [nio-8080-exec-2] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.gargoylesoftware.htmlunit.ScriptException: syntax error (https://www.flashscore.com/x/js/browsercompatibility_4.js#1)] with root cause
net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: syntax error (https://www.flashscore.com/x/js/browsercompatibility_4.js#1)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory$HtmlUnitErrorReporter.error(HtmlUnitContextFactory.java:436) ~[htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:251) ~[htmlunit-core-js-2.51.0.jar:na]
Looks like there is a syntax error in your replacement script. Maybe you can replace it with an empty one?
Hey rbri, in the meantime I've tried it with this, but it still fails:
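import java.io.IOException;
import java.net.MalformedURLException;
import java.nio.charset.StandardCharsets;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebRequest;
import com.gargoylesoftware.htmlunit.WebResponse;
import com.gargoylesoftware.htmlunit.WebResponseData;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.util.WebConnectionWrapper;

public void startScraper() throws FailingHttpStatusCodeException, MalformedURLException, IOException {
    String url = "https://www.flashscore.com/basketball/";
    try (final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED)) {
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setUseInsecureSSL(true);
        webClient.getOptions().setCssEnabled(true);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.waitForBackgroundJavaScriptStartingBefore(1000);

        // The wrapper's constructor installs it as the client's web connection.
        new WebConnectionWrapper(webClient) {
            @Override
            public WebResponse getResponse(WebRequest request) throws IOException {
                WebResponse response = super.getResponse(request);
                if (request.getUrl().toExternalForm().contains("browsercompatibility")) {
                    // Replace the compatibility script with an empty one.
                    String content = "";
                    WebResponseData data = new WebResponseData(content.getBytes(StandardCharsets.UTF_8),
                            response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
                    response = new WebResponse(data, request, response.getLoadTime());
                }
                return response;
            }
        };

        HtmlPage page = webClient.getPage(url);
        webClient.waitForBackgroundJavaScript(3_000);

        System.out.println();
        System.out.println();
        System.out.println("----------------");
        System.out.println(page.asNormalizedText());
        System.out.println("----------------");
    }
}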
2021-07-16 15:22:45.844 WARN 1524 --- [ main] c.g.htmlunit.DefaultCssErrorHandler : CSS error: 'https://www.flashscore.com/res/_fs/build/livetableresponsive.c7059bf.css' [1:8910] Error in pseudo class or element. (Invalid token ".". Was expecting one of: <S>, <NUMBER>, <IDENT>, <STRING>, "-", <PLUS>, <DIMENSION>.)
2021-07-16 15:22:45.844 WARN 1524 --- [ main] c.g.htmlunit.DefaultCssErrorHandler : CSS warning: 'https://www.flashscore.com/res/_fs/build/livetableresponsive.c7059bf.css' [1:8910] Ignoring the whole rule.
2021-07-16 15:22:46.305 WARN 1524 --- [ main] c.g.htmlunit.IncorrectnessListenerImpl : Obsolete content type encountered: 'text/javascript'.
2021-07-16 15:22:46.487 ERROR 1524 --- [ main] c.g.h.j.DefaultJavaScriptErrorListener : Error during JavaScript execution
com.gargoylesoftware.htmlunit.ScriptException: invalid property id (https://www.flashscore.com/res/_fs/build/loader.5714507.js#1)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:954) ~[htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:580) ~[htmlunit-core-js-2.51.0.jar:na]
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:481) ~[htmlunit-core-js-2.51.0.jar:na]
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.callSecured(HtmlUnitContextFactory.java:352) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:785) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:751) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:112) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1122) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:1002) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.executeScriptIfNeeded(ScriptElementSupport.java:196) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.ScriptElementSupport$1.execute(ScriptElementSupport.java:120) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.onAllChildrenAddedToPage(ScriptElementSupport.java:143) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:191) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:551) ~[htmlunit-2.51.0.jar:2.51.0]
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source) ~[xercesImpl-2.12.0.jar:na]
at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:503) ~[htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1216) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1156) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:219) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:312) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3189) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2114) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:937) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:443) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:394) ~[neko-htmlunit-2.51.0.jar:2.51.0]
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.12.0.jar:na]
at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.parse(HtmlUnitNekoDOMBuilder.java:751) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoHtmlParser.parse(HtmlUnitNekoHtmlParser.java:208) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:297) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:217) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:684) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:586) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:501) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:413) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:548) ~[htmlunit-2.51.0.jar:2.51.0]
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:529) ~[htmlunit-2.51.0.jar:2.51.0]
Caused by: net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: invalid property id (https://www.flashscore.com/res/_fs/build/loader.5714507.js#1)
Looks like another error, this time:
invalid property id (https://www.flashscore.com/res/_fs/build/loader.5714507.js#1)
This JS file is a huge minified script. At least this part uses syntax that is not supported:
function(...e){let t=this._configData;
I fear you have to wait until this is fixed in Rhino.
see #755
Based on a JavaScript execution exception question on Stack Overflow
HtmlUnit Version: 2.50.0
During the getPage call for the webpage flashscore.com, I got the following exceptions.
I've tried with two different classes and the problem still occurs.