ariya / phantomjs

Scriptable Headless Browser
http://phantomjs.org
BSD 3-Clause "New" or "Revised" License

Operation canceled on some pages #12750

Closed Guigoz closed 4 years ago

Guigoz commented 9 years ago

Hello,

I'm experiencing some issues with PhantomJS when trying to render some URLs. It works well for 99% of the requested URLs, but some fail with the "Operation canceled" error.

To reproduce the error you can use this simple script:

var webPage = require('webpage');
var page = webPage.create();

page.viewportSize = { width: 1920, height: 1080 };
page.open("http://www.thalesgroup.com", function start(status) {
  page.render('screenshot.jpeg', {format: 'jpeg', quality: '100'});
  phantom.exit();
});

It will produce the following result.

phantomjs-1.9.8-linux-x86_64/bin/phantomjs --debug=true --ignore-ssl-errors=yes --web-security=no --ssl-protocol=any simpleRender.js 
2014-11-17T15:18:32 [DEBUG] CookieJar - Created but will not store cookies (use option '--cookies-file=<filename>' to enable persisten cookie storage) 
2014-11-17T15:18:32 [DEBUG] Phantom - execute: Configuration 
2014-11-17T15:18:32 [DEBUG]      0 objectName : "" 
2014-11-17T15:18:32 [DEBUG]      1 cookiesFile : "" 
2014-11-17T15:18:32 [DEBUG]      2 diskCacheEnabled : "false" 
2014-11-17T15:18:32 [DEBUG]      3 maxDiskCacheSize : "-1" 
2014-11-17T15:18:32 [DEBUG]      4 ignoreSslErrors : "true" 
2014-11-17T15:18:32 [DEBUG]      5 localToRemoteUrlAccessEnabled : "false" 
2014-11-17T15:18:32 [DEBUG]      6 outputEncoding : "UTF-8" 
2014-11-17T15:18:32 [DEBUG]      7 proxyType : "http" 
2014-11-17T15:18:32 [DEBUG]      8 proxy : ":1080" 
2014-11-17T15:18:32 [DEBUG]      9 proxyAuth : ":" 
2014-11-17T15:18:32 [DEBUG]      10 scriptEncoding : "UTF-8" 
2014-11-17T15:18:32 [DEBUG]      11 webSecurityEnabled : "false" 
2014-11-17T15:18:32 [DEBUG]      12 offlineStoragePath : "" 
2014-11-17T15:18:32 [DEBUG]      13 offlineStorageDefaultQuota : "-1" 
2014-11-17T15:18:32 [DEBUG]      14 printDebugMessages : "true" 
2014-11-17T15:18:32 [DEBUG]      15 javascriptCanOpenWindows : "true" 
2014-11-17T15:18:32 [DEBUG]      16 javascriptCanCloseWindows : "true" 
2014-11-17T15:18:32 [DEBUG]      17 sslProtocol : "any" 
2014-11-17T15:18:32 [DEBUG]      18 sslCertificatesPath : "" 
2014-11-17T15:18:32 [DEBUG]      19 webdriver : ":" 
2014-11-17T15:18:32 [DEBUG]      20 webdriverLogFile : "" 
2014-11-17T15:18:32 [DEBUG]      21 webdriverLogLevel : "INFO" 
2014-11-17T15:18:32 [DEBUG]      22 webdriverSeleniumGridHub : "" 
2014-11-17T15:18:32 [DEBUG] Phantom - execute: Script & Arguments 
2014-11-17T15:18:32 [DEBUG]      script: "simpleRender.js" 
2014-11-17T15:18:32 [DEBUG] Phantom - execute: Starting normal mode 
2014-11-17T15:18:32 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/_coffee-script.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/package.json" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/coffee-script.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./lexer.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/././rewriter.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/././helpers.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./parser.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./helpers.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./nodes.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/././scope.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./././helpers.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/././lexer.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/../coffee-script/./lib/coffee-script/./././rewriter.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:32 [DEBUG] WebPage - updateLoadingProgress: 10 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 16 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 19 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 21 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 27 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 31 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 37 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 40 
2014-11-17T15:18:35 [DEBUG] WebPage - updateLoadingProgress: 42 
2014-11-17T15:18:35 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:35 [DEBUG] CookieJar - Saved "has_js=1; domain=www.thalesgroup.com; path=/" 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 45 
2014-11-17T15:18:36 [DEBUG] CookieJar - Saved "has_js=1; domain=www.thalesgroup.com; path=/" 
2014-11-17T15:18:36 [DEBUG] CookieJar - Saved "context_breakpoints=none; domain=www.thalesgroup.com; path=/" 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 100 
2014-11-17T15:18:36 [DEBUG] Network - Resource request error: 5 ( "Operation canceled" ) URL: "https://secure.leadforensics.com/Track/Capture.aspx?trk_user=26055&trk_sw=1024&trk_sh=768&trk_ref=&trk_tit=Thales Group&trk_loc=https://www.thalesgroup.com/en&trk_agn=Netscape&trk_agv=Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34.lfcd32.lflngfr-FR&trk_dom=www.thalesgroup.com&trk_guid=1e390c06-f5a3-41e8-bc51-d4a46952b54e&trk_cookie=NA" 
2014-11-17T15:18:36 [DEBUG] Network - Resource request error: 5 ( "Operation canceled" ) URL: "https://www.google-analytics.com/analytics.js" 
2014-11-17T15:18:36 [DEBUG] Network - Resource request error: 5 ( "Operation canceled" ) URL: "https://www.thalesgroup.com/sites/default/files/js/js_3gTGsabd1RPSpnBDnrFkHoY_7DsRs04arZaXhxjiyHY.js" 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 10 
2014-11-17T15:18:36 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/_coffee-script.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 10 
2014-11-17T15:18:36 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/_coffee-script.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r") ) )  
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 100 
2014-11-17T15:18:36 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:36 [DEBUG] Network - Resource request error: 5 ( "Operation canceled" ) URL: "https://www.thalesgroup.com/en" 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 100 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 10 
2014-11-17T15:18:36 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:36 [DEBUG] WebPage - updateLoadingProgress: 100 
2014-11-17T15:18:36 [DEBUG] WebPage - setupFrame "" 
2014-11-17T15:18:36 [DEBUG] CookieJar - Purged (session) "context_breakpoints=none; domain=www.thalesgroup.com; path=/" 
2014-11-17T15:18:36 [DEBUG] CookieJar - Purged (session) "has_js=1; domain=www.thalesgroup.com; path=/" 

Thanks for your help. Guillaume.

bprodoehl commented 9 years ago

I see the same thing with a build of PhantomJS 2.0. Given the nature of Thales Group, perhaps they just have some safeguards in place against this sort of scraping.

grooveek commented 9 years ago

That's not some sort of special protection. Curl works just fine with no additional configuration, not even a custom UA string. Stracing the process didn't give me any clues. Perhaps someone knows a good way to debug PhantomJS website-related bugs...

thoop commented 9 years ago

I'm experiencing this as well with Linux x86_64 PhantomJS 1.9.8 and PhantomJS 2.0. It doesn't happen all the time, but I'll get an "Operation canceled" on the request with id === 1.
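A minimal sketch of how a script might spot this case, assuming (as reported above) that the top-level document is the request with `id === 1`. `describeResourceError` is a hypothetical helper name; `page.onResourceError` and the `id`, `url`, `errorCode`, and `errorString` fields come from the PhantomJS web page API.

```javascript
// Hypothetical helper: formats a resource error and flags the main request.
// Assumes, as reported above, that the top-level document has request id 1.
function describeResourceError(resourceError) {
    var isMainRequest = resourceError.id === 1;
    return {
        isMainRequest: isMainRequest,
        message: (isMainRequest ? 'MAIN ' : 'sub ') +
            'request #' + resourceError.id +
            ' failed with code ' + resourceError.errorCode +
            ' (' + resourceError.errorString + '): ' + resourceError.url
    };
}

// The wiring below only runs inside PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.onResourceError = function (resourceError) {
        console.log(describeResourceError(resourceError).message);
    };
    page.open('http://www.thalesgroup.com', function (status) {
        console.log('Load status: ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

Logging the flag separately makes it easier to tell a canceled tracker script apart from a canceled main document.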

vinnitu commented 9 years ago

I have the same trouble too (dev 2.0.0): Network - Resource request error: 5 ( "Operation canceled" )

entrity commented 9 years ago

:+1:

maximilianredt commented 9 years ago

same here...

vinnitu commented 9 years ago

I cannot understand what the real problem is. I made a simple app with Qt 5.4 and QWebView, and my URL loaded successfully.

tgt commented 9 years ago

I also have the same issue with various websites. For example, http://www.bbc.co.uk/ fails when fetching it through an HTTP proxy, but can be fetched successfully without a proxy.

Is there any information I can provide that'll help you diagnose the issue?

arikkfir commented 9 years ago

Happened to me as well. I had a page containing multiple IFRAMEs, and the script attempted a series of requests by clicking links in one IFRAME, causing a refresh on another IFRAME. Each "click" on a link in the menu IFRAME was preceded by a call to "switchToFrame", and then the actual "click" on the link (using page.evaluate).

Once I added a "switchToMainFrame" right after the "page.evaluate", the problem disappeared for me.
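The workaround above can be sketched roughly as follows. `evaluateInFrame` is a hypothetical helper name, and the frame name and selector in the usage note are placeholders; `switchToFrame`, `evaluate`, and `switchToMainFrame` are the PhantomJS page methods mentioned in the comment.

```javascript
// Hypothetical helper: run `fn` inside a child frame, then always return
// focus to the main frame, as in the workaround described above.
function evaluateInFrame(page, frameName, fn) {
    // Treat an explicit `false` from switchToFrame as "frame not found".
    if (page.switchToFrame(frameName) === false) {
        return { ok: false, result: null };
    }
    var result;
    try {
        result = page.evaluate(fn);
    } finally {
        // Switch back even if evaluate() throws.
        page.switchToMainFrame();
    }
    return { ok: true, result: result };
}
```

Inside a PhantomJS script this might be called as `evaluateInFrame(page, 'menu', function () { document.querySelector('a').click(); })`. Whether it helps will depend on the page.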

sadasidha commented 9 years ago

Is this issue solved yet? I am getting the same error

vclteam commented 9 years ago

I'm also getting it on some pages.

mattlyons0 commented 8 years ago

I am also getting this issue about 50% of the time my script runs. My script deals with a website with around 25 iframes, and I am using switchToFrame multiple times to get into nested iframes.

Oddly enough, the onResourceError url is always an image. However, that may be because the website loads mostly images with ajax, so the probability of getting another resource is low.

Note, I'm on Arch Linux with phantomJS 2.0

sadasidha commented 8 years ago

@mattlyons0 Exactly, I am facing the same issue, which is leading to lots of unnecessary errors.

sintanial commented 8 years ago

I have the same trouble on many pages. It reproduces on versions 1.9.8, 2.0.0, and 2.0.1-dev. And sometimes I hit another problem: "Segmentation fault" :((

Does anybody have an idea how to solve this problem? I've been suffering with these bugs for 2 weeks.

ParthBarot-BoTreeConsulting commented 8 years ago

I'm also facing the same issue; it stops at this error: Resource request error: 5 ( "Operation canceled" ) URL: "https://www.google-analytics.com/analytics.js"

apelican commented 8 years ago

I am getting this as well - Phantom 2.0.0, OSX 10.10.5:

2015-12-18T16:51:17 [DEBUG] Network - Resource request error: 5 ( "Operation canceled" ) URL: "http://www.google.com/uds/?file=visualization&v=1.0&packages=corechart"

Any advice or feedback?

harrismaan commented 8 years ago

Hello guys, I am facing similar issues and it seems to be related to redirection, since direct loading of the url works. Any idea why this could be happening?

Dexus77 commented 8 years ago

I have this error - PhantomJS 2.0.0, CentOS 7:

[DEBUG - 2015-12-28T13:02:29.231Z] Session [3959c460-ad63-11e5-81ae-c570b6e5ae25] - page.onResourceError - {"errorCode":5,"errorString":"Operation canceled","id":57,"status":null,"statusText":null,"url":"https://mc.yandex.ru/webvisor/14067460?rn=398363363&page-url=http%3A%2F%2Fjob.ukr.net%2F&wmode=0&wv-type=0&wv-hit=260517100&wv-part=2&wv-check=3341&browser-info=z%3A120%3Ai%3A20151228150224%3Arqnl%3A1%3Ast%3A1451307749%3Au%3A145130774598162282"}

listen-lavender commented 8 years ago

Any solution to this problem?

apelican commented 8 years ago

After some tinkering, my root cause was initiating async work after the complete event had fired. I'm not sure if there's a better time in the lifecycle to try to bring in async resources (I was trying to load Google Charts, since it doesn't allow me to inline the JavaScript library). I wound up setting the page content to include the resource request as a script tag and this issue stopped for me, but your mileage and root cause may vary. My hunch is that open requests are getting killed after the renderer is done.
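A rough sketch of that approach, assuming the library can be referenced from a plain script tag. `buildPageHtml` is a hypothetical helper, and the URL and markup are placeholders rather than the actual Google Charts setup; `page.content` and `page.onLoadFinished` are part of the PhantomJS page API.

```javascript
// Hypothetical helper: builds a document whose <head> already references the
// external library, so it is requested as part of the normal page load.
function buildPageHtml(scriptUrl, bodyHtml) {
    // Escape characters that would break out of the attribute value.
    var safeUrl = String(scriptUrl)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/</g, '&lt;');
    return '<!DOCTYPE html><html><head>' +
        '<script src="' + safeUrl + '"></script>' +
        '</head><body>' + bodyHtml + '</body></html>';
}

// The wiring below only runs inside PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.onLoadFinished = function (status) {
        console.log('Load status: ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    };
    // Placeholder URL and markup; substitute the real library and content.
    page.content = buildPageHtml('https://example.com/library.js',
        '<div id="chart"></div>');
}
```

The idea is that the script request belongs to the initial load instead of being started after the load has completed.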

tjoneseng commented 8 years ago

I am seeing this when the page loads Javascript through the CloudFront CDN. When Javascript is loaded from the same host I don't get the "operation canceled" errors.

csman commented 8 years ago

I switched from phantomjs 1.9 to 2.1 and I'm getting murdered by this error. No websites work. Zero. Every website dies with this:

2016-02-19T07:41:00 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" )

I updated Qt to 5.5, same thing. Anyone??

zackw commented 8 years ago

None of us devs have been able to reproduce this bug. It must be caused by something about the environment in which y'all are running the tests. We are going to need you to tinker with your environment and try to find out what triggers it. Unfortunately, I don't have the least idea what could be the problem.

Please note that PhantomJS is statically linked against Qt (we have to do this, our copy of Qt is nontrivially modified) and therefore upgrading the system copy of Qt will never make anything change. 2.1 is already using Qt 5.5.

smurf667 commented 8 years ago

Hello,

we too have been struggling a lot with this issue. We're on RHEL using PhantomJS 2.1.1. The application is AngularJS-based with require loading lots of modules simultaneously. The problem for us always occurred at startup when one (random) module failed to load due to the cancelled network operation, breaking the app. I liked the theory put forth by apelican; however, when I tried to put a long-loading image request (5s) into the markup, it did not help. On the contrary, afterwards I observed complete hangs. So it is probably not the renderer killing pending requests.

However, some observations and a workaround for our case: Not sure where exactly the issue lies, maybe a race condition or a concurrent request overload? The problem often occurred when running in a setup where Tomcat 7 was serving the app, with the testing code (Java WebDriver) and the hosted app on the same machine. When configuring an HTTP proxy for the PhantomJS driver with --proxy the problem never occurred (this is our workaround: we start a Jetty-based HTTP proxy and route all requests through it, and we have not seen any aborted requests). We also used the Paros proxy experimentally and did not observe the problem. Either the proxy slows things down sufficiently (if it is a race condition), or the PhantomJS code takes a different, non-problematic route for handling requests. The problem also did not occur when we used the simple Python HTTP server to serve the app from the file system (instead of through Tomcat).
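As a small illustration of the proxy workaround, the command-line arguments could be assembled like this. `buildProxyArgs` is a hypothetical helper, and the host, port, and script name are placeholders; `--proxy` is the option named above and `--proxy-type` is its companion PhantomJS option.

```javascript
// Hypothetical helper: builds the PhantomJS command line for routing all
// requests through a local HTTP proxy, the workaround described above.
function buildProxyArgs(proxyHost, proxyPort, scriptPath) {
    return [
        '--proxy=' + proxyHost + ':' + proxyPort,
        '--proxy-type=http',
        scriptPath
    ];
}

// Placeholder values; prints the arguments to pass to the phantomjs binary.
console.log(buildProxyArgs('127.0.0.1', 8080, 'test.js').join(' '));
```

With a WebDriver setup the same two options would be handed to the driver as PhantomJS command-line arguments instead of being typed on a shell.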

We've analysed the logs of the app server and can only conclude that in problem cases the requests never reached the app server.

Hope this helps...

kristianoye commented 8 years ago

I am getting the same behavior. I am able to make about 5 successful page requests. On my 6th request, however, I get a resource error and all subsequent requests fail with Error Code 5 / "Operation Cancelled". Is this the same issue? Can I clear out previous errors?

ghost commented 8 years ago

Hi. Please try this - https://github.com/ariya/phantomjs/issues/10389#issuecomment-103650123

PhantomJS does not process some redirects correctly.

cajus commented 8 years ago

I am getting the same. But only on travis-ci. The tests are running fine locally with the same static version of phantomjs (latest). On travis, I'm getting random failures like this (complete log):

Runing tests for qx.test.bom.History
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 10
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 50
2016-03-15T09:48:31 [DEBUG] WebPage - setupFrame ""
000049 qx.core.Init: Load runtime: 49ms
000058 Waiting for tests
2016-03-15T09:48:31 [DEBUG] WebPage - setupFrame "<!--framePath //<!--frame0-->-->"
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 50
000061 Loading tests...
000061 qx.core.Init: Main runtime: 12ms
000063 qx.core.Init: Finalize runtime: 1ms
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 50
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 60
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 69
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:31 [WARNING] QIODevice::write (QTcpSocket): device not open
2016-03-15T09:48:31 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:32 [DEBUG] WebPage - updateLoadingProgress: 90
2016-03-15T09:48:33 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://127.0.0.1:31324/framework/source/class/qx/data/SingleValueBinding.js?nocache=0.9663659152574837"
2016-03-15T09:48:33 [DEBUG] WebPage - updateLoadingProgress: 100

Here is the script I'm using to do the run.

@zackw: I'm not familiar with Qt network programming at all. What does "device not open" mean and what can be the cause? Is there anything that I can do to debug the problem?

zackw commented 8 years ago

@cajus I am not familiar with Qt network programming at all myself. I found that error message in the code, and it appears to be generated when something tries to write to a network socket that hadn't been fully initialized; however, that doesn't tell me who did that or why or how the socket got into that state. It smells to me like Webkit is not talking to Qt correctly -- but without a reliable reproducer that works on my machine this is as far as I can go.

cajus commented 8 years ago

@zackw ok ;-) It sounded like you were involved in that stuff. If I could reproduce it locally, there would be many more ways to debug than on that remote docker stuff without any access. Let's see...

wskorodecki commented 8 years ago

I am getting this as well. PhantomJS v2.1.1. Ubuntu 14.04 64bit. CasperJS v1.1.0-beta5. Below is simple script showing this behaviour.

Please set correct email & password before testing. http://pastebin.com/bRRRkvDa

Command: casperjs --engine=phantomjs --ignore-ssl-errors=yes --ssl-protocol=any --debug=true bug-test.js

After switching to --engine=slimerjs it works.

@cajus I hope it will help you.

mepard commented 8 years ago

FWIW, I'm getting this, too. It's happening with POSTs to a FuseAction, but not all the time. It happens with current 2.2.0-development built locally, but does not happen with 1.9.8. It happens on Windows 10, but does not happen on OS X 10.11. I'm using current casperjs master built locally. When it happens, execution continues for a while, but it eventually results in a casper step timeout and gets stuck.

I have WireShark captures of it happening and not happening and I'm still investigating.

mepard commented 8 years ago

I have an idea what's happening in my case. I believe the cancelled operations are pending when navigation to a new page is requested. This makes sense and shouldn't be a problem, but it looks like it occasionally cancels a POST after the headers are written and before the content. I can see the headers in Wireshark, but none of the content. The stream stays open until much later when the process is terminated, at which point the client does a FIN and the server replies with a 504 GATEWAY_TIMEOUT with Connection: Close.

The process terminates after a later page load never completes and a casper step times out. My guess is that it's trying to reuse the kept-alive HTTP connection and can't write the new request because the old request hasn't finished, so it times out.

@Vitallium Is it plausible that a Qt HTTP stream could behave this way (cancel an HTTP request between the headers and content, then get stuck trying to use the stream for a future request)? I'm heading into the source, but I'd appreciate your thoughts.

I've attached a packet capture with an example.

cancel and gateway timeout.pcapng.zip

gsouf commented 8 years ago

@zackw I made some investigations and found something that might be interesting for you.

I could actually reproduce the error.

I can't tell if this is what happens in every situation, but it basically works for me. (or should I say fails? :smile:)

It all comes down to the headers setting. More specifically, it will fail when setting the Host header.

TLDR: Removing the Host header will work every time!

A Working example

Executing phantomjs with this file will work:

"use strict";
var page = require('webpage').create();
var lastResourceError = null;

page.onResourceError = function (resourceError) {
    console.log("Url errored: " + resourceError.url);
    console.log("message : " + resourceError.errorString);
    console.log("------")
};

var settings = {
}

var url = "https://www.wikipedia.org/";

page.open(url, settings, function (status) {
    if (status !== 'success') {
        console.error('Error');
        phantom.exit(1);
    } else {
        console.log(page.plainText);
        phantom.exit();
    }
});

Reproduce the error

Now replace the settings variable with this:

var settings = {
    "headers": {
        "Host": "wikipedia.org"
    }
}

var url = "https://www.wikipedia.org/";

Executing that will fail.

Using a different host will work

Now replace the settings with a different host, and that will work (though the server returns a domain error):

var settings = {
    "headers": {
        "Host": "google.com"
    }
}

var url = "https://www.wikipedia.org/";

Host redirection will also fail

Querying Google with a different existing TLD will fail (Google redirects to the TLD in the Host header):

var settings = {
    "headers": {
        "Host": "google.fr"
    }
}

var url = "https://www.google.com/";

(redirect to google.fr then fails)

Some cases will work

I found a case in which it works: the website getcomposer.org. Maybe it is due to which resources are or are not loaded on the page.

The two following examples work:

var settings = {
}

var url = "https://getcomposer.org";

var settings = {
    "headers": {
        "Host": "getcomposer.org"
    }
}

var url = "https://getcomposer.org";

But for some reason this will also fail (operation cancelled):

var settings = {
    "headers": {
        "Host": "foo.bar"
    }
}

var url = "https://getcomposer.org";

As a side note for this last example, trying it in Chrome leads to a redirect loop.

Page with no resource

Last note: trying this on a page that does not load many resources (http://httpbin.org/get) seems to work in every case.
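For anyone who builds the settings object from a shared header list, a minimal sketch of dropping the Host entry before calling `page.open` follows. `withoutHostHeader` is a hypothetical helper and the extra `Accept-Language` header is only an illustration; the `headers` key in the `page.open` settings is the one used in the examples above.

```javascript
// Hypothetical helper: returns a copy of `headers` without any Host entry
// (header names are case-insensitive), leaving the original object untouched.
function withoutHostHeader(headers) {
    var result = {};
    for (var name in headers) {
        if (Object.prototype.hasOwnProperty.call(headers, name) &&
                name.toLowerCase() !== 'host') {
            result[name] = headers[name];
        }
    }
    return result;
}

// The wiring below only runs inside PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var settings = {
        headers: withoutHostHeader({
            'Host': 'wikipedia.org',
            'Accept-Language': 'en'
        })
    };
    page.open('https://www.wikipedia.org/', settings, function (status) {
        console.log('Load status: ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

This only covers the Host header case described in this comment; it is not a claim about the other reports in the thread.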

zackw commented 8 years ago

@gsouf Screwing with the Host header is a weird thing to do, I suspect Qt can't be persuaded to support it, and I doubt that other people seeing this problem are doing it. I filed #14164 so we can discuss it separately.

@Guigoz et al.: Are you screwing with the Host header? If so, please explain why.

fannahlin commented 8 years ago

I'm getting the same. I can reproduce this error almost all the time with some URLs.
These URLs request another link for data, and that data request always fails with this error.

tsirolnik commented 8 years ago

Same thing happening for me too. Using node:argon container (debian) and node-horseman.
This fucks up my whole app. Also, this throws me mongoose's Error 403: Directory Listing Denied

marktheunissen commented 8 years ago

Here is an absolutely trivial example that Phantom cannot deal with:

<html>
<head>
  <script>
        var a = document.createElement("a");
        a.href = '/target.php';
        a.click();
    </script>
</head>
<body>Index page</body>
</html>

You would expect the browser to be redirected to target.php. This works in Chrome, for example.

Phantom does this (notice Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://dev-phantom-failure-sample.pantheonsite.io/target.php"):

I have hosted this example here for testing: http://dev-phantom-failure-sample.pantheonsite.io/index.php

2016-06-10T12:01:07 [DEBUG] CookieJar - Created but will not store cookies (use option '--cookies-file=<filename>' to enable persistent cookie storage)
2016-06-10T12:01:07 [DEBUG] Set  "http"  proxy to:  "" : 1080
2016-06-10T12:01:07 [DEBUG] Phantom - execute: Configuration
2016-06-10T12:01:07 [DEBUG]      0 objectName : ""
2016-06-10T12:01:07 [DEBUG]      1 cookiesFile : ""
2016-06-10T12:01:07 [DEBUG]      2 diskCacheEnabled : "false"
2016-06-10T12:01:07 [DEBUG]      3 maxDiskCacheSize : "-1"
2016-06-10T12:01:07 [DEBUG]      4 diskCachePath : ""
2016-06-10T12:01:07 [DEBUG]      5 ignoreSslErrors : "false"
2016-06-10T12:01:07 [DEBUG]      6 localUrlAccessEnabled : "true"
2016-06-10T12:01:07 [DEBUG]      7 localToRemoteUrlAccessEnabled : "false"
2016-06-10T12:01:07 [DEBUG]      8 outputEncoding : "UTF-8"
2016-06-10T12:01:07 [DEBUG]      9 proxyType : "http"
2016-06-10T12:01:07 [DEBUG]      10 proxy : ":1080"
2016-06-10T12:01:07 [DEBUG]      11 proxyAuth : ":"
2016-06-10T12:01:07 [DEBUG]      12 scriptEncoding : "UTF-8"
2016-06-10T12:01:07 [DEBUG]      13 webSecurityEnabled : "true"
2016-06-10T12:01:07 [DEBUG]      14 offlineStoragePath : ""
2016-06-10T12:01:07 [DEBUG]      15 localStoragePath : ""
2016-06-10T12:01:07 [DEBUG]      16 localStorageDefaultQuota : "-1"
2016-06-10T12:01:07 [DEBUG]      17 offlineStorageDefaultQuota : "-1"
2016-06-10T12:01:07 [DEBUG]      18 printDebugMessages : "true"
2016-06-10T12:01:07 [DEBUG]      19 javascriptCanOpenWindows : "true"
2016-06-10T12:01:07 [DEBUG]      20 javascriptCanCloseWindows : "true"
2016-06-10T12:01:07 [DEBUG]      21 sslProtocol : "default"
2016-06-10T12:01:07 [DEBUG]      22 sslCiphers : "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-RC4-SHA:ECDHE-RSA-RC4-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:RC4-SHA:RC4-MD5"
2016-06-10T12:01:07 [DEBUG]      23 sslCertificatesPath : ""
2016-06-10T12:01:07 [DEBUG]      24 sslClientCertificateFile : ""
2016-06-10T12:01:07 [DEBUG]      25 sslClientKeyFile : ""
2016-06-10T12:01:07 [DEBUG]      26 sslClientKeyPassphrase : ""
2016-06-10T12:01:07 [DEBUG]      27 webdriver : ":"
2016-06-10T12:01:07 [DEBUG]      28 webdriverLogFile : ""
2016-06-10T12:01:07 [DEBUG]      29 webdriverLogLevel : "INFO"
2016-06-10T12:01:07 [DEBUG]      30 webdriverSeleniumGridHub : ""
2016-06-10T12:01:07 [DEBUG] Phantom - execute: Script & Arguments
2016-06-10T12:01:07 [DEBUG]      script: "phantom/phantom_screenshot.js"
2016-06-10T12:01:07 [DEBUG]      0 arg: "1000"
2016-06-10T12:01:07 [DEBUG]      1 arg: "800"
2016-06-10T12:01:07 [DEBUG]      2 arg: "http://dev-phantom-failure-sample.pantheonsite.io/index.php"
2016-06-10T12:01:07 [DEBUG]      3 arg: "test.png"
2016-06-10T12:01:07 [DEBUG] Phantom - execute: Starting normal mode
2016-06-10T12:01:07 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:07 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:07 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:07 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T10:01:07 [INFO] Initial URL: http://dev-phantom-failure-sample.pantheonsite.io/index.php
2016-06-10T10:01:07 [INFO] Target file: test.png
2016-06-10T10:01:07 [INFO] No Basic Auth in environment
2016-06-10T12:01:07 [DEBUG] WebPage - updateLoadingProgress: 10
2016-06-10T12:01:08 [DEBUG] CookieJar - Saved "NO_CACHE=1; domain=dev-phantom-failure-sample.pantheonsite.io; path=/"
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 30
2016-06-10T10:01:08 [INFO] URL changed to: http://dev-phantom-failure-sample.pantheonsite.io/index.php
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 100
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 10
phantomjs failed to reach url
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://dev-phantom-failure-sample.pantheonsite.io/target.php"
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 100
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 10
2016-06-10T10:01:08 [INFO] URL changed to: about:blank
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 100
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 10
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-06-10T12:01:08 [DEBUG] WebPage - updateLoadingProgress: 100
2016-06-10T10:01:08 [INFO] Document ready, waiting: 3000 ms before taking screenshot
2016-06-10T12:01:08 [DEBUG] WebPage - setupFrame ""
2016-06-10T12:01:08 [DEBUG] CookieJar - Purged (session) "NO_CACHE=1; domain=dev-phantom-failure-sample.pantheonsite.io; path=/"

vijay22sai commented 8 years ago

I am also getting this error continuously: error code: 5, description: Operation canceled

I fixed it by closing the page and recreating it whenever I visit another website.
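A sketch of that pattern, under the assumption that a new page object per site is acceptable for the script. `openInFreshPage` is a hypothetical helper and the URLs are placeholders; `webpage.create()`, `page.open`, and `page.close` are the PhantomJS calls involved.

```javascript
// Hypothetical helper: opens `url` in a brand-new page and closes that page
// once the callback has run, so no page state is shared between sites.
function openInFreshPage(createPage, url, callback) {
    var page = createPage();
    page.open(url, function (status) {
        try {
            callback(status, page);
        } finally {
            page.close();
        }
    });
}

// The wiring below only runs inside PhantomJS, where `phantom` is defined.
if (typeof phantom !== 'undefined') {
    var webpage = require('webpage');
    var urls = ['http://example.com/', 'http://example.org/']; // placeholders
    var visitNext = function () {
        if (urls.length === 0) {
            phantom.exit();
            return;
        }
        var url = urls.shift();
        openInFreshPage(function () { return webpage.create(); }, url,
            function (status) {
                console.log(url + ' -> ' + status);
                // Schedule the next visit after this page has been closed.
                setTimeout(visitNext, 0);
            });
    };
    visitNext();
}
```

The page must not be used after the callback returns, since it is closed at that point.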

outboundexplorer commented 8 years ago

I have a very "hacky" workaround for this. I am not quite sure why this works or more to the point why it is failing in the first place, but what I can see happening is the following:

I am opening a page and it is going off and fetching all the resources that it needs. Once the Page.onLoadFinished event is called, I am injecting a script that is doing all my DOM work. If for whatever reason, one of the resources that the page is loading fails before my DOM script has finished, then the script will abort and I will not get the data that I need.

After studying the resources that trigger the onResourceError event, I have so far tracked it down to a few common culprits: adzerk.net, facebook.com, doubleclick, adition.com and quantserve.com. As these are the ones that keep coming up, I have used the onResourceRequested callback to redirect the URL and simply ping a blank page on my website. This way the flow continues and my injected script is able to finish without any problems. I have also noticed that if I give the server enough juice and shut everything else down, then my script can usually finish before there is an issue (without the need to redirect the URL).
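
A rough sketch of that redirect, assuming a simple substring match on the host is good enough. `isBlocked`, `redirectBlocked` and the blank-page URL are hypothetical names and values introduced here, not taken from the commenter's script; `page.onResourceRequested` and `networkRequest.changeUrl()` are the PhantomJS hooks being relied on.

```javascript
// Host fragments named in the comment above; extend as needed.
var BLOCKED_HOSTS = ['adzerk.net', 'facebook.com', 'doubleclick',
                     'adition.com', 'quantserve.com'];

// Returns true when the host part of the URL contains a blocked fragment.
// Only the host is inspected, so a blocked name in the path or query
// string does not count.
function isBlocked(url) {
  var match = /^[a-z][a-z0-9+.\-]*:\/\/([^\/?#]+)/i.exec(url);
  if (!match) {
    return false;                        // e.g. about:blank or data: URLs
  }
  var host = match[1].toLowerCase();
  for (var i = 0; i < BLOCKED_HOSTS.length; i++) {
    if (host.indexOf(BLOCKED_HOSTS[i]) !== -1) {
      return true;
    }
  }
  return false;
}

// Installs a handler that points blocked requests at blankUrl instead.
// blankUrl is an assumption: any small page you control should do.
function redirectBlocked(page, blankUrl) {
  page.onResourceRequested = function (requestData, networkRequest) {
    if (isBlocked(requestData.url)) {
      networkRequest.changeUrl(blankUrl);
    }
  };
}
```

In a PhantomJS script this would be called once after `require('webpage').create()`, for example `redirectBlocked(page, 'http://example.com/blank.html')` with the placeholder replaced by a real blank page.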

Sorry if this seems to be a bit of a cowboy approach, but my first impressions are that this is something to do with authentication cookies. I don't really have any idea why they would not be processed in the same way a normal web-page is. Hope this makes sense :)

My feeling is that this bug should be reproducible if the injected script is fairly heavy and takes a while, allowing one of the resources loaded by this page, such as the "adzerk" ones, to fail:

http://stackoverflow.com/questions/1200214/how-can-i-measure-the-speed-of-code-written-in-php

foby commented 8 years ago

Disabling web security fixed it for me (--web-security=false). I use PhantomJS version 2.1.1 with in-memory cache on an AWS EC2 Linux machine.

bluepeter commented 8 years ago

After some tinkering, my root cause was initiating async work after the complete event has been [...] My hunch is this is open requests getting killed after the renderer is done.

^^ This.

RubenVerborgh commented 8 years ago

Certain header settings can indeed trigger this bug. I found a strange case.

This fails for certain domains:

page.settings.userAgent = USER_AGENT;

while this equivalent code works perfectly:

page.customHeaders = { 'User-Agent': USER_AGENT };

Failure only happens with certain domains, but consistently so.

It might have to do with header order, or with customHeaders being passed to sub-resources, but userAgent not?

RubenVerborgh commented 8 years ago

I wonder if it is in any way related to PhantomJS detection based on header order. I can make some requests work by manually setting the correct host header (related to this comment).

RubenVerborgh commented 8 years ago

I've created a test that aims to reproduce this error. It tries to load a subset of the Alexa top 500 websites through separate invocations of PhantomJS, and prints whether this works. I have chosen the subset such that it includes websites that have caused trouble on my machines before.

There are two variations of the test:

  A. open the page and report the status
  B. open the page twice and report the status of the second time

The results on my machine (OS X 10.9.5) are as follows:

test  success  failure  …due to “operation canceled”  crash
A     35       41       32                            4
B     48       24       15                            7

To test, clone the gist and execute ./run.

ariya commented 8 years ago

@RubenVerborgh I appreciate the test setup. While test volume is important, in this particular case, what's useful is a reduced test case that can be verified in an isolated environment. The previous excellent test case from @marktheunissen fits these criteria; alas, I haven't been able to reproduce it.

marktheunissen commented 8 years ago

@ariya Hey there, does my test case not reproduce the bug for you?

RubenVerborgh commented 8 years ago

@ariya The purpose of my test setup is not volume, but to find reproducible test cases in your environment.

The problem in this bug is that some sites cause an issue on some machines, and some sites do not. When you run my test, it will return the list of sites for which the bug occurs on your machine; you can continue with any failing site to create a reduced test case for your specific isolated environment.

So could you please try to run it to see whether the bug indeed occurs at least once? We could then continue to reduce the failing test case for your setup.

bluepeter commented 8 years ago

This is a 2-year-old bug. It's frustrating to see the lack of progress on this due to test case requirements. @ariya maybe you can have a try with @RubenVerborgh's solid approach at getting you a reproducible case?

dernasherbrezon commented 8 years ago

I was able to track down this "bug". Here is the sequence of events:

  1. A resource is requested asynchronously via Ajax.
  2. My script then saves the resulting page to a file and executes: phantom.exit(0);
  3. "Operation canceled" is reported for some resources.

This happens because of slow server responses. When I put my phantom.exit(0) call into window.setTimeout(function () {}, 10000); everything went smoothly.

I suggest keeping a map of requested resources. Once a resource is received, remove its URL from that map. In "onLoadFinished", periodically wake up to check whether the map is empty.
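
The map-based suggestion above might look roughly like this. It is a sketch under stated assumptions, not a tested fix: `createTracker`, `attachTracker` and `whenIdle` are hypothetical helpers, and the map is keyed by the request `id` that PhantomJS passes to its resource callbacks rather than by URL, since the same URL can be requested more than once.

```javascript
// Keeps track of requests that have started but not yet finished.
function createTracker() {
  var pending = {};                      // request id -> URL
  var count = 0;
  return {
    requested: function (id, url) {
      if (!(id in pending)) { pending[id] = url; count++; }
    },
    finished: function (id) {
      if (id in pending) { delete pending[id]; count--; }
    },
    isIdle: function () { return count === 0; },
    pendingUrls: function () {
      var urls = [];
      for (var id in pending) {
        if (pending.hasOwnProperty(id)) { urls.push(pending[id]); }
      }
      return urls;
    }
  };
}

// Hooks the tracker into a PhantomJS page. A resource counts as finished
// when its final chunk arrives (stage "end") or when it errors out.
function attachTracker(page, tracker) {
  page.onResourceRequested = function (requestData) {
    tracker.requested(requestData.id, requestData.url);
  };
  page.onResourceReceived = function (response) {
    if (response.stage === 'end') { tracker.finished(response.id); }
  };
  page.onResourceError = function (resourceError) {
    tracker.finished(resourceError.id);
  };
}

// Polls every intervalMs until nothing is pending or maxWaitMs has passed,
// then calls callback(true) if idle, callback(false) on timeout.
function whenIdle(tracker, intervalMs, maxWaitMs, callback) {
  var waited = 0;
  (function check() {
    if (tracker.isIdle() || waited >= maxWaitMs) {
      callback(tracker.isIdle());
      return;
    }
    waited += intervalMs;
    setTimeout(check, intervalMs);
  })();
}
```

In a PhantomJS script, `attachTracker(page, tracker)` would be called before `page.open`, and the render plus `phantom.exit(0)` would move into the `whenIdle` callback, so the upper bound on waiting replaces the fixed 10-second delay.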

RubenVerborgh commented 8 years ago

@dernasherbrezon That doesn't seem to apply to the original test case. There, the sequence is:

  1. Resource X requested
  2. open callback returns "fail" (because "operation canceled" already occurred)
  3. phantom.exit()

So a timeout will not help there.