baudehlo / node-phantom-simple

Simple bridge to phantomjs for Node
MIT License
201 stars, 70 forks

node-phantom-simple is very slow #120

Open surendhar1988 opened 8 years ago

surendhar1988 commented 8 years ago

I am tracking network calls via onResourceReceived. It responds very slowly when a site makes a large number of external requests, and sometimes no response arrives at all. It works well only for sites with few external calls.

When I run the same site with a locally installed phantomjs, it works well. Please help me improve the performance of the node-phantom-simple module.

The site I tried (http://citycentredeira.com) does not work via node-phantom-simple, but it works well via the locally installed phantomjs.

Node.js code below:

driver.create({ path: require('phantomjs').path }, function(err, browser) {
    console.log('Create phantomhs done');
    browser.createPage(function(err, page) {
        console.log('Create page done');
        page.set('settings.userAgent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36', function() {
            console.log('User Agent settings done');
        });
        page.open(url, function(err, status) {
            console.log('Open page done');
        });
        page.onResourceReceived = function(response) {
            //console.log('Response URL :' + splitDomain(response.url));
            var urldomain = splitDomain(response.url);
            if (urldomain.startsWith('http') && result.indexOf(urldomain) == -1) {
                result.push(splitDomain(response.url));
            }
        };

        page.onLoadFinished = function(status) {
            console.log(' onLoadFinished Status: ' + status);
            console.log(result);
            page.close();
            //browser.exit();
            completeHandler(result);
        };
    });
});
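
The snippet above calls splitDomain() and pushes into result, but neither is shown in the post. For readers who want to reproduce the problem, here is a minimal sketch of what those pieces might look like. Both splitDomain and createDomainCollector are hypothetical here, written only for illustration, and it is an assumption that the helper is meant to return the scheme plus host of each resource URL. A Set replaces the indexOf scan so that duplicate checks stay cheap on pages with many requests; PhantomJS reports each resource more than once (a "start" and an "end" stage), so duplicates are common.

```javascript
// Hypothetical helpers: the original post does not show splitDomain() or result.
const { URL } = require('url');

// Return the origin (scheme + host + optional port) of a resource URL,
// or an empty string when the input cannot be parsed as a URL.
function splitDomain(resourceUrl) {
  try {
    return new URL(resourceUrl).origin;
  } catch (err) {
    return '';
  }
}

// Collect unique http(s) origins in insertion order.
function createDomainCollector() {
  const seen = new Set();
  return {
    add(resourceUrl) {
      const origin = splitDomain(resourceUrl);
      // data: and about: URLs have the origin "null", so they are skipped.
      if (origin.startsWith('http')) {
        seen.add(origin);
      }
    },
    list() {
      return Array.from(seen);
    }
  };
}
```

Under these assumptions, the onResourceReceived handler would reduce to a single collector.add(response.url) call, and collector.list() would be passed to completeHandler.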
surendhar1988 commented 8 years ago

Hi,

Please let us know whether this is expected to fail on heavily loaded sites. Is there an alternative way to fix this?

For the sites below, I am not able to receive the external domain details. The run takes a very long time and finally times out.

http://www.polygon.com/
http://www.theverge.com/