HtmlUnit / htmlunit

HtmlUnit is a "GUI-Less browser for Java programs".
https://www.htmlunit.org
Apache License 2.0

Page is unable to completely render due to JavaScript #172

Open okeuday opened 4 years ago

okeuday commented 4 years ago

Tested with HtmlUnit 2.40.0. The URL is https://optout.aboutads.info/ and it may be difficult to get working, but I wanted to report the problem. I was trying to get the raw content of the page with HtmlUnit's error logging disabled (typical JavaScript problems were expected). The page does some JavaScript processing: it shows a progress bar and eventually reaches a window with various text information and a Continue button. The content HtmlUnit returns shows that it never gets to that window, so the button cannot be clicked to continue. I was using BrowserVersion.BEST_SUPPORTED for the request.
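For readers who want to reproduce the "logging disabled" part of this setup, here is a minimal sketch. It assumes HtmlUnit 2.x (package `com.gargoylesoftware.htmlunit`) is logging through Apache Commons Logging with `java.util.logging` as the backend, which is the usual fallback when no other logging framework is on the classpath. The class name `QuietHtmlUnitLogging` is hypothetical, and the sketch does not touch HtmlUnit itself.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietHtmlUnitLogging {
    // Hold a strong reference: java.util.logging only keeps weak references
    // to loggers, so a configured logger could otherwise be garbage collected.
    private static final Logger HTMLUNIT_LOGGER = Logger.getLogger("com.gargoylesoftware");

    // Turns off every logger under com.gargoylesoftware that has no level of its own.
    static Logger silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        return HTMLUNIT_LOGGER;
    }

    public static void main(String[] args) {
        silence();
        Logger child = Logger.getLogger("com.gargoylesoftware.htmlunit.WebClient");
        // The child inherits the OFF level from its parent namespace.
        System.out.println("SEVERE still logged? " + child.isLoggable(Level.SEVERE));
    }
}
```

Call `QuietHtmlUnitLogging.silence()` once before creating the `WebClient`. If a different logging backend (for example Log4j) is configured, this has no effect and the level has to be set in that backend instead.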

The XML that is currently returned is below:

<?xml version="1.0" encoding="UTF-8"?>
<html ng-controller="MainController" class="js flexbox no-flexboxlegacy canvas canvastext webgl touch geolocation postmessage no-websqldatabase no-indexeddb hashchange history draganddrop websockets rgba hsla multiplebgs backgroundsize borderimage borderradius boxshadow textshadow opacity cssanimations csscolumns cssgradients no-cssreflections csstransforms no-csstransforms3d csstransitions fontface no-generatedcontent video audio localstorage sessionstorage webworkers applicationcache svg inlinesvg smil svgclippaths ng-scope" ng-app="naibcApp">
  <!-- <![endif]-->  <head>
    <style type="text/css">
      [uib-typeahead-popup].dropdown-menu{display:block;}
    </style>
    <style type="text/css">
      .uib-time input{width:50px;}
    </style>
    <style type="text/css">
      [uib-tooltip-popup].tooltip.top-left &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.top-right &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.bottom-left &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.bottom-right &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.left-top &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.left-bottom &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.right-top &gt; .tooltip-arrow,[uib-tooltip-popup].tooltip.right-bottom &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.top-left &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.top-right &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.bottom-left &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.bottom-right &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.left-top &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.left-bottom &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.right-top &gt; .tooltip-arrow,[uib-tooltip-html-popup].tooltip.right-bottom &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.top-left &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.top-right &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.bottom-left &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.bottom-right &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.left-top &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.left-bottom &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.right-top &gt; .tooltip-arrow,[uib-tooltip-template-popup].tooltip.right-bottom &gt; .tooltip-arrow,[uib-popover-popup].popover.top-left &gt; .arrow,[uib-popover-popup].popover.top-right &gt; .arrow,[uib-popover-popup].popover.bottom-left &gt; .arrow,[uib-popover-popup].popover.bottom-right &gt; .arrow,[uib-popover-popup].popover.left-top &gt; .arrow,[uib-popover-popup].popover.left-bottom &gt; .arrow,[uib-popover-popup].popover.right-top &gt; .arrow,[uib-popover-popup].popover.right-bottom &gt; .arrow,[uib-popover-html-popup].popover.top-left &gt; 
.arrow,[uib-popover-html-popup].popover.top-right &gt; .arrow,[uib-popover-html-popup].popover.bottom-left &gt; .arrow,[uib-popover-html-popup].popover.bottom-right &gt; .arrow,[uib-popover-html-popup].popover.left-top &gt; .arrow,[uib-popover-html-popup].popover.left-bottom &gt; .arrow,[uib-popover-html-popup].popover.right-top &gt; .arrow,[uib-popover-html-popup].popover.right-bottom &gt; .arrow,[uib-popover-template-popup].popover.top-left &gt; .arrow,[uib-popover-template-popup].popover.top-right &gt; .arrow,[uib-popover-template-popup].popover.bottom-left &gt; .arrow,[uib-popover-template-popup].popover.bottom-right &gt; .arrow,[uib-popover-template-popup].popover.left-top &gt; .arrow,[uib-popover-template-popup].popover.left-bottom &gt; .arrow,[uib-popover-template-popup].popover.right-top &gt; .arrow,[uib-popover-template-popup].popover.right-bottom &gt; .arrow{top:auto;bottom:auto;left:auto;right:auto;margin:0;}[uib-popover-popup].popover,[uib-popover-html-popup].popover,[uib-popover-template-popup].popover{display:block !important;}
    </style>
    <style type="text/css">
      .uib-datepicker-popup.dropdown-menu{display:block;float:none;margin:0;}.uib-button-bar{padding:10px 9px 2px;}
    </style>
    <style type="text/css">
      .uib-position-measure{display:block !important;visibility:hidden !important;position:absolute !important;top:-9999px !important;left:-9999px !important;}.uib-position-scrollbar-measure{position:absolute !important;top:-9999px !important;width:50px !important;height:50px !important;overflow:scroll !important;}.uib-position-body-scrollbar-measure{overflow:scroll !important;}
    </style>
    <style type="text/css">
      .uib-datepicker .uib-title{width:100%;}.uib-day button,.uib-month button,.uib-year button{min-width:100%;}.uib-left,.uib-right{width:100%}
    </style>
    <style type="text/css">
      .ng-animate.item:not(.left):not(.right){-webkit-transition:0s ease-in-out left;transition:0s ease-in-out left}
    </style>
    <style type="text/css">
      @charset "UTF-8";[ng\:cloak],[ng-cloak],[data-ng-cloak],[x-ng-cloak],.ng-cloak,.x-ng-cloak,.ng-hide:not(.ng-hide-animate){display:none !important;}ng\:form{display:block;}.ng-animate-shim{visibility:hidden;}.ng-anchor{position:absolute;}
    </style>
    <meta http-equiv="X-UA-Compatible" content="IE=edge"/>
    <meta charset="utf-8"/>
    <meta name="viewport" content="width=device-width"/>
    <base href="/"/>
    <title translate="global.title" class="ng-scope">
      WebChoices: Digital Advertising Alliance's Consumer Choice Tool for Web US
    </title>
    <link rel="P3Pv1" href="/w3c/p3p.xml" type="text/xml"/>
    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"/>
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Oswald:400,300,700"/>
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Gudea:400,700,400italic"/>
    <link rel="stylesheet" href="/styles/f2e2d05b.vendor-home.css"/>
    <link rel="stylesheet" href="/styles/5351c859.home.css"/>
    <!--[if lte IE 8]>
        <style>.navbar-collapse{ display: none !important; }</style>
        <!--<![endif]-->    <!--[if lte IE 8]>
        <script src="/scripts/3b40df82.shiv-home.js"></script>
        <!--<![endif]-->    <!-- Determine the theme on the server and set a variable to determine it at runtime. -->    <script async="" src="//www.google-analytics.com/analytics.js">
    </script>
    <script type="text/javascript" src="/theme">
    </script>
  </head>
  <body class="client-daa" ng-class="{'is-admin': isAuthorized(userRoles.admin)}">
    <div class="loading-overlay" ng-hide="hideOverlay">
    </div>
    <div class="navbar navbar-default" role="navigation">
      <div class="container">
        <div class="navbar-title">
          <a href="/">
            <span translate="" class="ng-scope">
              WebChoices: Digital Advertising Alliance's Consumer Choice Tool for Web US
            </span>
          </a>
        </div>
        <div class="navbar-header">
          <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-collapse">
            <span class="sr-only">
              Toggle navigation
            </span>
            <span class="icon-bar">
            </span>
            <span class="icon-bar">
            </span>
            <span class="icon-bar">
            </span>
          </button>
          <span class="language-menu language-menu__mobile">
            <!-- ngIf: client === 'daa' -->            <span ng-if="client === 'daa'" class="dropdown nav navbar-nav navbar-right ng-scope">
              <a data-toggle="dropdown">
                <span class="glyphicon glyphicon-globe language-menu__icon" aria-hidden="true">
                </span>
              </a>
              <ul class="dropdown-menu">
                <li>
                  <a ng-hide="theme==='daac' &amp;&amp; lang==='fr'" href="" ng-click="changeLanguage('daac', 3,'fr')" class="">
                    Français Canadien
                  </a>
                  <a ng-hide="theme==='daac' &amp;&amp; lang==='en'" href="" ng-click="changeLanguage('daac', 3,'en')" class="">
                    Canadian English
                  </a>
                  <a ng-hide="theme==='apda'" href="" ng-click="changeLanguage('apda', 4,'es')" class="">
                    Español Latinoamericano
                  </a>
                  <a ng-hide="theme==='daa' &amp;&amp; lang==='en'" href="" ng-click="changeLanguage('daa', 2, 'en')" class="">
                    US English
                  </a>
                  <a ng-hide="theme==='daa' &amp;&amp; lang==='es'" href="" ng-click="changeLanguage('daa', 2, 'es')" class="">
                    Español de Estados Unidos
                  </a>
                </li>
              </ul>
            </span>
            <!-- end ngIf: client === 'daa' -->          </span>
          <a class="navbar-brand" ng-href="https://optout.aboutads.info/" href="https://optout.aboutads.info/">
            <img ng-src="/images/daa/nav_logo.png" alt="DAA Logo" src="/images/daa/nav_logo.png"/>
          </a>
        </div>
        <div class="collapse navbar-collapse" id="navbar-collapse" ng-switch="authenticated">
          <ul class="nav navbar-nav navbar-right external-links">
            <!-- ngSwitchWhen: false -->            <li ng-switch-when="false" class="dropdown ng-scope">
              <button type="button" class="dropdown-toggle" data-toggle="dropdown">
                <span class="sr-only">
                  Toggle navigation
                </span>
                <span class="icon-bar">
                </span>
                <span class="icon-bar">
                </span>
                <span class="icon-bar">
                </span>
              </button>
              <ul class="dropdown-menu">
                <li>
                  <a translate="" href="https://youradchoices.com/pmc" target="_blank" class="ng-scope">
                    Protect My Choices
                  </a>
                  <!-- ngIf: theme!=='apda' -->                  <a ng-if="theme!=='apda'" translate="" href="https://youradchoices.com/choices-faq#jr02" target="_blank" class="ng-scope">
                    Understanding How Online Advertising Works
                  </a>
                  <!-- end ngIf: theme!=='apda' -->                  <!-- ngIf: theme!=='apda' -->                  <a ng-if="theme!=='apda'" translate="" href="https://youradchoices.com/choices-faq" target="_blank" class="ng-scope">
                    Frequently Asked Questions
                  </a>
                  <!-- end ngIf: theme!=='apda' -->                  <a translate="" href="http://www.youradchoices.com" target="_blank" class="ng-scope">
                    Visit YourAdChoices
                  </a>
                  <a translate="" href="https://youradchoices.com/principles" target="_blank" class="ng-scope">
                    DAA Principles
                  </a>
                  <a translate="" href="https://youradchoices.com/learn#zone-preface3-wrapper" target="_blank" class="ng-scope">
                    Report a Problem
                  </a>
                  <!-- ngIf: theme!=='apda' -->                  <a ng-if="theme!=='apda'" translate="" href="https://youradchoices.com/choices-faq#jr10" target="_blank" class="ng-scope">
                    Help with WebChoices
                  </a>
                  <!-- end ngIf: theme!=='apda' -->                  <a translate="" href="http://youradchoices.com/privacy-policy?language=en" target="_blank" class="ng-scope">
                    DAA Privacy Policy
                  </a>
                  <a translate="" href="http://youradchoices.com/terms-of-use" target="_blank" class="ng-scope">
                    DAA Terms of Service
                  </a>
                </li>
              </ul>
              <span class="language-menu language-menu__desktop">
                <!-- ngIf: client === 'daa' -->                <span ng-if="client === 'daa'" class="dropdown nav navbar-nav navbar-right ng-scope">
                  <a data-toggle="dropdown">
                    <span class="glyphicon glyphicon-globe language-menu__icon" aria-hidden="true">
                    </span>
                  </a>
                  <ul class="dropdown-menu">
                    <li>
                      <a ng-hide="theme==='daac' &amp;&amp; lang==='fr'" href="" ng-click="changeLanguage('daac', 3,'fr')" class="">
                        Français Canadien
                      </a>
                      <a ng-hide="theme==='daac' &amp;&amp; lang==='en'" href="" ng-click="changeLanguage('daac', 3,'en')" class="">
                        Canadian English
                      </a>
                      <a ng-hide="theme==='apda'" href="" ng-click="changeLanguage('apda', 4,'es')" class="">
                        Español Latinoamericano
                      </a>
                      <a ng-hide="theme==='daa' &amp;&amp; lang==='en'" href="" ng-click="changeLanguage('daa', 2, 'en')" class="">
                        US English
                      </a>
                      <a ng-hide="theme==='daa' &amp;&amp; lang==='es'" href="" ng-click="changeLanguage('daa', 2, 'es')" class="">
                        Español de Estados Unidos
                      </a>
                    </li>
                  </ul>
                </span>
                <!-- end ngIf: client === 'daa' -->              </span>
            </li>
            <!-- end ngSwitchWhen: -->            <!-- ngSwitchWhen: true -->            <!-- ngSwitchWhen: true -->            <!-- ngSwitchWhen: true -->          </ul>
        </div>
      </div>
    </div>
    <div class="container">
      <!--[if lt IE 9]>
        <div class="browserupgrade">
                <p translate="global.browsehappy">You are using an <strong>outdated</strong> browser. Please <a href=\"http://browsehappy.com/?locale=en\">upgrade your browser</a> to improve your experience.</p>
        </div>
        <![endif]-->      <div id="nojavascript" style="display: none;">
        <h1>
          JavaScript Not Detected
        </h1>
        <p>
          The site requires that you enable JavaScript for your browser. Please turn on JavaScript for your browser and click "Try Again" to proceed.
        </p>
        <p>
          <a href="" class="btn btn-primary" role="button">
            Try Again
          </a>
           -
          <a href="" class="btn btn-info" role="button">
            Learn More
          </a>
        </p>
      </div>
      <!-- ngView: -->      <div class="footer">
      </div>
    </div>
    <script type="text/javascript">
//<![CDATA[
// Fake noscript tag to avoid Chrome bug
        document.getElementById('nojavascript').style.display = 'none';
//]]>
    </script>
    <script src="/scripts/600fbcb3.home.js">
    </script>
  </body>
</html>

cykuo3 commented 4 years ago

I also encountered this problem. Help, please!

twendelmuth commented 4 years ago

Well... this is a bit difficult. HtmlUnit successfully renders the page, but I'd assume it is not executing all of the JavaScript that is supposed to run; most likely some script that is expected to be triggered never fires. However, the JavaScript of the page is about 700 KB and not really meant to be read by humans ;-)

Do you guys know what should be triggered?

okeuday commented 4 years ago

@twendelmuth When I use curl, https://optout.aboutads.info/ returns a 302 redirect to https://optout.aboutads.info/?c=2&lang=EN, which then provides similar HTML with two JavaScript files:

https://optout.aboutads.info/scripts/3b40df82.shiv-home.js (at the top, 2639 bytes)
https://optout.aboutads.info/scripts/600fbcb3.home.js (at the bottom, the 674 KiB one mentioned)

Using the js-beautify command-line utility (on Linux), it is easy to format the JavaScript. The formatted files are now at https://gist.github.com/okeuday/23615961a18897b979ceb9052c6b73be .
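The script URLs can also be pulled out of the returned HTML programmatically. The following is a rough sketch using only the Java standard library; the class name `ScriptSources` is hypothetical, and the regular expression is a shortcut rather than a real HTML parser (it also picks up scripts inside IE conditional comments, such as the shiv file).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScriptSources {
    // Captures the double-quoted src attribute of each <script ...> tag.
    private static final Pattern SCRIPT_SRC =
        Pattern.compile("<script\\b[^>]*?\\bsrc\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every script src found in the HTML, resolved against baseUrl.
    static List<String> list(String html, String baseUrl) {
        URI base = URI.create(baseUrl);
        List<String> urls = new ArrayList<>();
        Matcher matcher = SCRIPT_SRC.matcher(html);
        while (matcher.find()) {
            urls.add(base.resolve(matcher.group(1)).toString());
        }
        return urls;
    }

    public static void main(String[] args) {
        // Two script tags as they appear in the page source quoted in this issue.
        String html = "<script src=\"/scripts/3b40df82.shiv-home.js\"></script>"
                    + "<script src=\"/scripts/600fbcb3.home.js\"></script>";
        list(html, "https://optout.aboutads.info/").forEach(System.out::println);
    }
}
```

Inline scripts without a `src` attribute are skipped, so the list only shows the external files that would need to be fetched and inspected.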

That initial page is trying to do some HTML5 graphics with a progress bar, followed by a dialog box that must be drawn over the content (while the JavaScript loads content underneath). So to use a properly rendered page in HtmlUnit, it would need to be possible to click the dialog box button and interact with the page in other ways. The output from HtmlUnit is similar to the output from curl, and it seems like the JavaScript may be skipping its changes to the HTML because of HTML5 feature checks, but I am not sure.
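One way to tell a "curl-like" result from a rendered one is to check whether the routed view was ever inserted. In the XML dump above, the `<!-- ngView: -->` placeholder comment is followed directly by the footer. The sketch below assumes the rendered view would show up as an element with an `ng-view` tag or attribute name, as AngularJS usually does; that is an assumption about this site, and the class name `NgViewCheck` is hypothetical.

```java
import java.util.regex.Pattern;

final class NgViewCheck {
    // Matches a real tag (not a <!-- comment -->) that mentions ng-view
    // as a tag name or attribute name.
    private static final Pattern VIEW_ELEMENT =
        Pattern.compile("<(?!!)[^>]*\\bng-view\\b", Pattern.CASE_INSENSITIVE);

    // True if the markup contains an ng-view element, false if only the
    // placeholder comment (or nothing at all) is present.
    static boolean viewRendered(String xml) {
        return VIEW_ELEMENT.matcher(xml).find();
    }

    public static void main(String[] args) {
        String placeholderOnly = "<!-- ngView: -->      <div class=\"footer\">\n</div>";
        String withView = "<!-- ngView: --><div ng-view=\"\" class=\"ng-scope\">content</div>";
        System.out.println(viewRendered(placeholderOnly));
        System.out.println(viewRendered(withView));
    }
}
```

Running such a check on the output of `page.asXml()` after waiting for background scripts would show whether the view eventually appears or the application stops before routing.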