johntitus / node-horseman

Run PhantomJS from Node
MIT License

Add support for polyfilling missing PhantomJS JavaScript features #230

Open awlayton opened 7 years ago

awlayton commented 7 years ago

PhantomJS does not support some features that websites expect it to. It would be nice for horseman to have built-in support for polyfilling these features.

Here is a way I came up with for polyfilling externally with Polyfill.io, for reference. It is somewhat gross, so if someone comes up with a better way that's great.

var Horseman = require('node-horseman');
var POLYURL = 'https://cdn.polyfill.io/v2/polyfill.min.js';

var horseman = new Horseman();

horseman
    .userAgent()
    .then(function(useragent) { // Get Polyfill based on real UserAgent
        // URL-encode the user agent; it contains spaces, slashes and semicolons
        return this.download(POLYURL + '?ua=' + encodeURIComponent(useragent));
    })
    .then(function(polyfill) {
        // Make polyfill into function
        var polyfun = new Function(polyfill);
        // Wrap function in evaluateJavaScript
        var onInit = new Function(
            'console.log("onInitialized");' +
            'page.evaluateJavaScript(' + polyfun.toString() + ')'
        );
        // Call before page loads
        return this.at('initialized', onInit);
    })
    /* Do horseman stuff now */

Assistance/contribution would be greatly appreciated.
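
The string-building step in the snippet above can be hard to follow, so here is a minimal sketch of what it produces, runnable in plain Node without horseman or PhantomJS. Everything here is a stand-in: the one-line `polyfill` string, `stubPage`, and `fakeWindow` are hypothetical, and `page` is passed as a parameter only so the stub can be injected (in the snippet above it is a free variable resolved in the PhantomJS context).

```javascript
// Stand-in for the text returned by download(); the real polyfill is far larger.
var polyfill = 'window.polyfillLoaded = true;';

// Step 1: wrap the source text in a function so it can be serialized again.
var polyfun = new Function(polyfill);

// Step 2: build the handler body. polyfun.toString() yields the function's
// source text, so the body passes a function expression to evaluateJavaScript.
var body = 'page.evaluateJavaScript(' + polyfun.toString() + ')';

// Assumption for this sketch only: take `page` as a parameter so a stub can
// be supplied instead of a real PhantomJS page object.
var onInit = new Function('page', body);

// Hypothetical stub page: runs whatever it receives against a fake window.
var fakeWindow = {};
var stubPage = {
    evaluateJavaScript: function (fn) {
        // Re-evaluate the function's source with `window` bound to fakeWindow.
        var run = new Function('window', '(' + String(fn) + ')();');
        run(fakeWindow);
    }
};

onInit(stubPage);
console.log(body);
console.log(fakeWindow.polyfillLoaded);
```

The point of the double wrapping is that the polyfill ends up as source text inside the handler, so it still works once the handler itself is serialized and sent to another context.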

dickeylth commented 7 years ago

I struggled for a while to inject polyfill scripts into the page before any JS is loaded, and this issue finally showed me how to inject JS into the page before onLoadFinished!

awlayton commented 7 years ago

Glad you found this useful @dickeylth.

If I ever have spare time to work on horseman again, I will implement support for this myself. However, who knows when that will happen. Maybe someone else will be able to do it.

mvrahden commented 7 years ago

@awlayton I'm experiencing an error with your implementation of the polyfilling. My setup is:

My code is as follows:

let Horseman = require('node-horseman');
let POLYURL = 'https://cdn.polyfill.io/v2/polyfill.min.js';

let horseman = new Horseman();
  horseman
    .userAgent()
    .then((useragent) => { // Get Polyfill based on real UserAgent
      return this.download(POLYURL + '?ua=' + useragent);
    })
    .then((polyfill) => {
      // Make polyfill into function
      let polyfun = new Function(polyfill);
      // Wrap function in evaluateJavaScript
      let onInit = new Function(
        'console.log("onInitialized");' +
        'page.evaluateJavaScript(' + polyfun.toString() + ')'
      );
      // Call before page loads
      return this.at('initialized', onInit);
    })
    .userAgent('Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Mobile Safari/537.36')
    .viewport(414, 736)
    .open('https://www.google.com/') // some website
/* and do some stuff */

The error I'm experiencing is as follows:

Unhandled rejection TypeError: this.download is not a function

Do you, or anyone else, have a clue how to polyfill correctly, if the approach you proposed last year no longer works? I'd be happy for any help :)

awlayton commented 7 years ago

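The problem is not my implementation, @mvalipour. The problem is that you changed all the functions to arrow functions, and the this keyword works differently in arrow functions: they take this from the enclosing scope, so inside then it no longer refers to the horseman instance.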
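
To make the difference concrete, here is a minimal sketch that runs in plain Node. `FakeChain` is a hypothetical stand-in, not the real horseman API; it only assumes, as the snippets in this thread imply, that the callback passed to then is invoked with this set to the chain object.

```javascript
// Hypothetical stand-in for a Horseman-like chain; NOT the real library.
function FakeChain() {}

FakeChain.prototype.download = function (url) {
    return 'contents of ' + url;
};

// Invokes the callback with `this` set to the chain object.
FakeChain.prototype.then = function (callback) {
    return callback.call(this, 'SomeUserAgent');
};

var chain = new FakeChain();

// Regular function: `this` is whatever the caller supplies (the chain).
var regularResult = chain.then(function (useragent) {
    return typeof this.download;
});

// Arrow function: `this` comes from the enclosing scope, and the
// callback.call(this, ...) in then() cannot override it.
function enclosingScope() {
    return chain.then((useragent) => {
        return typeof this.download;
    });
}
var arrowResult = enclosingScope.call({}); // enclosing `this` is a plain object

console.log(regularResult, arrowResult); // prints "function undefined"
```

With the arrow callback, this.download is undefined, which is why calling it fails with "this.download is not a function".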

mvrahden commented 7 years ago

Wow, my bad. Thanks for helping out so fast 🥇 @awlayton

I no longer get the runtime error.

But something related to the polyfilling still doesn't seem to work -.- I have two observations:

  1. The onInit function does not appear to fire on the initialized event (although the second then callback is executed).
  2. I'm getting a ReferenceError: Can't find variable: WeakMap (the ES6 feature I wanted to provide via polyfilling in the first place).

The polyfill variable definitely holds a bunch of minified JavaScript code.

Do you have any experience with that?

awlayton commented 7 years ago

The onInit function runs within the PhantomJS context (which is neither Node.js nor a browser window), so you will not see the output of that console.log unless you have registered a callback for when PhantomJS tries to print to the console (I do not currently recall how to do that).

As for the WeakMap problem, I have never tried to use WeakMap, unfortunately. If possible, please post the actual code that produces the error.

mvrahden commented 7 years ago

@awlayton I wrote you an email. :)