Open okeuday opened 4 years ago
I also encountered this problem. Any help would be appreciated.
Well ... this is a bit difficult. HtmlUnit loads the page successfully, but I assume it is not executing some of the JavaScript that is supposed to run; most likely something the page expects to be triggered never is. However, the JavaScript of the page is about 700 KB and not really meant to be read by humans ;-)
Do you guys know what should be triggered?
@twendelmuth When I use curl, https://optout.aboutads.info/ returns a 302 redirect to https://optout.aboutads.info/?c=2&lang=EN, which then provides similar HTML that references two JavaScript files:
https://optout.aboutads.info/scripts/3b40df82.shiv-home.js (at the top, 2639 bytes)
https://optout.aboutads.info/scripts/600fbcb3.home.js (at the bottom, the 674 KiB one mentioned)
Using the js-beautify command-line utility (on Linux), it is easy to format the JavaScript. The formatted files are now at https://gist.github.com/okeuday/23615961a18897b979ceb9052c6b73be .
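The redirect and the script references described above can also be checked from Java without HtmlUnit. The following is only a rough sketch using the JDK's `java.net.http` client (Java 11+). The class name `ScriptLister` and the method `scriptSources` are made up for illustration, and the regular expression is a crude stand-in for a real HTML parser, so it is only suitable for a quick inspection like this one.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HtmlUnit.
final class ScriptLister {
    // Rough pattern for <script ... src="..."> or src='...'; not a real HTML parser.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the src attribute of every external <script> tag, in document order.
    static List<String> scriptSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = SCRIPT_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(2));
        }
        return sources;
    }

    public static void main(String[] args) throws Exception {
        // Do not follow redirects automatically, so the 302 stays visible.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();

        HttpRequest request = HttpRequest
                .newBuilder(URI.create("https://optout.aboutads.info/"))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("status: " + response.statusCode());

        // Follow a single redirect by hand, if the server sent a Location header.
        String location = response.headers().firstValue("Location").orElse(null);
        if (location != null) {
            URI target = request.uri().resolve(location);
            System.out.println("redirect: " + target);
            response = client.send(
                    HttpRequest.newBuilder(target).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
        }

        // List the external scripts referenced by the returned HTML.
        scriptSources(response.body()).forEach(System.out::println);
    }
}
```

This only shows what the server sends before any JavaScript runs, which is the baseline the HtmlUnit output can be compared against.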
The initial page does some HTML5 graphics with a progress bar and afterwards shows a dialog box that is drawn over the content (while the JavaScript loads content underneath). So getting a properly rendered page in HtmlUnit would require being able to click the dialog box button and perform other interaction. The output from HtmlUnit is similar to the output from curl, and it seems like the JavaScript may be skipping its changes to the HTML due to HTML5 checks, but I am not sure.
Tested with HtmlUnit 2.40.0. The URL is "https://optout.aboutads.info/" and may be difficult to get working, but I wanted to report the problem I hit when attempting to get the raw content of the page while disabling the logging of HtmlUnit errors (with the expectation of typical JavaScript problems). The URL does some JavaScript processing, showing a progress bar and eventually getting to a window with a continue button. The content HtmlUnit returns shows that it never reaches that window (which contains various text information), so the continue button cannot be clicked. I was using BrowserVersion.BEST_SUPPORTED for the request. The XML that is currently returned is below: