johntitus / node-horseman

Run PhantomJS from Node
MIT License

Problem with big files? #212

Open ile opened 8 years ago

ile commented 8 years ago

The following script fails with Error: Failed to GET url: https://farm9.staticflickr.com/8422/28832766510_900709792b_k_d.jpg

var Horseman = require("node-horseman"),
    horseman = new Horseman({
    injectJquery: false,
    ignoreSSLErrors: true
});

horseman
    .userAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36')
    .viewport(1300, 1300)
    .open('https://s3.amazonaws.com/arena_images-temp/uploads%2Fr2vtrllc%2Fpilgrim-take-4.gif')
    .then(function(result) {
        console.log(result);
    })
    .catch(function(err) {
        // Log the rejection reason (here: "Failed to GET url: ...")
        console.error(err);
    });

If I use node-phantom-simple directly, it seems to work:

var driver = require('node-phantom-simple');

driver.create({ parameters: { 'ignore-ssl-errors': 'yes' } }, function (err, browser) {
    return browser.createPage(function (err, page) {
        page.set('viewportSize', {
            width: 1500,
            height: 1200
        });

        page.set('userAgent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36');

        return page.open("https://s3.amazonaws.com/arena_images-temp/uploads%2Fr2vtrllc%2Fpilgrim-take-4.gif", function (err,status) {
            console.log(arguments);
            page.render('./example.jpg');
        });
    });
});

I haven't tracked the error down any further, so I'm not 100% sure it's in Horseman, but so far it looks that way. It seems to happen with big files (I'm not sure how big, maybe over 1 MB).

johntitus commented 8 years ago

Thanks for the example. I do get a "Failed to get URL" for the first script. However, it appears to be a timeout issue.

If I set the timeout to 20 seconds in the first part of the script, it seems to work fine.

var Horseman = require('./lib/index'),
    horseman = new Horseman({
    injectJquery: false,
    ignoreSSLErrors: true,
    timeout: 20000
});

ile commented 8 years ago

Right, thanks. Is there a possibility to change the error message to reflect that?

johntitus commented 8 years ago

It definitely should. Haven't looked into what's going on under the hood but I will.

awlayton commented 8 years ago

Unfortunately, .open in PhantomJS simply returns 'success' or 'fail' and says nothing about why it failed. In order to recognize that a failure was caused by a timeout, Horseman needs to register an onResourceTimeout callback and change its error message there.

I do not know when I will have time to implement this, but perhaps this will help someone else to write a fix.
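
As a starting point, here is a minimal sketch of that idea, not Horseman's actual code. It assumes the object PhantomJS passes to onResourceTimeout carries the timed-out request's url, and that comparing it to the URL given to .open as a plain string is good enough. createTimeoutTracker and errorFor are hypothetical names made up for illustration, and the example.com URLs are placeholders.

```javascript
// Hypothetical helper, not part of Horseman: remembers which URLs timed out
// so that a later 'fail' status from .open can be explained.
function createTimeoutTracker() {
    var timedOut = Object.create(null);

    return {
        // Meant to be registered as the page's onResourceTimeout callback.
        // Assumption: the request object exposes the URL as `request.url`.
        onResourceTimeout: function (request) {
            timedOut[request.url] = true;
        },

        // Builds the error for a failed open of `url`.
        // Assumption: exact string comparison of URLs is sufficient.
        errorFor: function (url, timeoutMs) {
            var message = 'Failed to GET url: ' + url;
            if (timedOut[url]) {
                message += ' (timed out after ' + timeoutMs + ' ms)';
            }
            return new Error(message);
        }
    };
}

// Simulated run without PhantomJS: one URL times out, the other fails
// for some unknown reason.
var tracker = createTimeoutTracker();
tracker.onResourceTimeout({ url: 'https://example.com/big.gif' });

console.error(tracker.errorFor('https://example.com/big.gif', 5000).message);
console.error(tracker.errorFor('https://example.com/other.gif', 5000).message);
```

With something like this wired in, a 'fail' status for a URL that was seen in onResourceTimeout could produce a message that mentions the timeout, while every other failure keeps the existing wording.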