ariya / phantomjs

Scriptable Headless Browser
http://phantomjs.org
BSD 3-Clause "New" or "Revised" License
29.45k stars 5.76k forks source link

std::bad_alloc #10365

Closed xbeumala closed 12 years ago

xbeumala commented 12 years ago

xavi.beu...@gmail.com commented:

version: phantom 1.4.1 (and also master compiled from Master - 1.5)

What steps will reproduce the problem?

  1. Create a sample.js script: var page = new WebPage(); page.address = phantom.args[0];

page.open(page.address, function (status) { if (status === 'fail') { console.error("Fail loading page " + page.address); } else { console.log("done"); }

phantom.exit();

});

  1. Invoke: phantomjs sample.js sample.html
  2. where sample.html is a simple html file.

What is the expected output? What do you see instead? The execution outputs:

"terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc

This application has requested the Runtime to terminate it in an unusual way.

Please contact the application's support team for more information."

Compiling a debug version:

2012-01-27T00:33:38 [WARNING] QIODevice::read: maxSize argument exceeds QByteArray size limit terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc

This application has requested the Runtime to terminate it in an unusual way.

Please contact the application's support team for more information.

Which operating system are you using? It fails on Win7 64 Enterprise. Works fine on Ubuntu.

Did you use binary PhantomJS or did you compile it from source? Both compiled and binary distribution fail.

Disclaimer: This issue was migrated on 2013-03-15 from the project's former issue tracker on Google Code, Issue #365. :star2:   6 people had starred this issue at the time of migration.

aportale commented 12 years ago

alessandro.portale@gmail.com commented:

I tested 1.4.1-win32-dynamic with your sample.js and a simple html file on Windows Vista 32. I do not get the crash, but a "Failed loading page..". Reason for the failing is that phantom.args[0] is not the first parameter you pass to phantomjs.exe but instead it is the phantom.exe path. When changing your sample.js to use phantom.args[1], I get a "done".

However, args[] order aside, PhantomJS is definitely not supposed to crash when opening bogus HTML and I am curious what causes the QByteArray size limit to burst (esp. on a x64 system). Btw.: thanks a lot for the debug version output, it gives already some more hints.

Could do a console.log(phantom.args[0]) ? Perhaps the crash has something to do with the path?

aportale commented 12 years ago

alessandro.portale@gmail.com commented:

..just reading that Issue #359 and #360 have the same cause. On Windows7 x86 (32-bit). So, this is not about a possibly wrong args[].

xbeumala commented 12 years ago

xavi.beu...@gmail.com commented:

Hi Alessandro,

Not sure on Win32 but on Win64 and Ubuntu when executing "phantomjs.exe sample.js sample.html":

  1. phantom.args[0] >> sample.html
  2. phantom.args[1] >> undefined

When changing the script to load phantom.args[1] like you mention "page.address = phantom.args[1];" I also get the "done" message. But I'm highly confused because it's loading an "undefined" page, it's not loading the sample.html page.

Are you sure that args[1] should be the one to be used?

execjosh commented 12 years ago

I have no problem when I us (both cygwin and cmd.exe) a phantomjs.exe *2 I built on win7 64-bit with Qt Creator. However, directly using the phantomjs-1.4.1-win32-dynamic download phantomjs.exe *1, I get the std::bad_alloc error.

I deleted all but phantomjs.exe *1 and tried again. What do you know, it works perfectly! I guess that means it is using the DLLs on my path instead of the ones bundled in the zip.

So, my best guess is that this looks to be a 64- vs. 32-bit issue.

aportale commented 12 years ago

alessandro.portale@gmail.com commented:

xavi.beumala: Oops. please forget all that I said about the arg[] order. You are right that arg[0] contains the URL.

buildjosh: Thanks for the tests. Could it be that the just replacing the (perhaps outdated?) mingwm10.dll and libgcc_s_dw2-1.dll with those that are in your path would fix the issue?

xbeumala commented 12 years ago

xavi.beu...@gmail.com commented:

@buildjosh @alessandro I've tried to run phantom from QT Creator as @buildjosh says but I still have the same issue (same from CLI).

@buildjosh are you using QT4.8 and building from MASTER or 1.4.1 branch?

execjosh commented 12 years ago

@xbeumala: I built branches master and v1.4 (tag v1.4.1 doesn't build without modification on Windows). But, I'm still using Qt 1.7.4. When I look at the Qt Creator version dialog, though, it mentions 32-bit, which is odd considering that I installed it on a 64-bit Win7 box. Hmmm...

execjosh commented 12 years ago

Oops, that was a mistype. It should be Qt 4.7.4 (not 1.7.4)!

execjosh commented 12 years ago

@aportale: I checked the md5 hashes for those dlls. Mine are the same as those redistributed in phantomjs-1.4.1-win32-dynamic.zip.

$ echo $QT_VERSION
4.7.4

$ md5sum $QTDIR/bin/{mingwm10.dll,libgcc_s_dw2-1.dll} | cut -b-32 | sort | md5sum
e520bd82427d5c0e7358ca83dbdf3ee1 *-

$ md5sum phantomjs-1.4.1-win32-dynamic/{mingwm10.dll,libgcc_s_dw2-1.dll} | cut -b-32 | sort | md5sum
e520bd82427d5c0e7358ca83dbdf3ee1 *-
execjosh commented 12 years ago

After doing some tinkering, I've come to the conclusion that a change in QUrl for 4.8 has rendered it useless for paths that do not use the "file" scheme. PhantomJS opens the address in WebPage::openUrl() by passing it along to m_mainFrame->load(). That's where the trouble starts...

QWebFrame::load() calls tries to ensure that the URL is absolute. Eventually QUrl::toLocalFile() gets called, and, well, just have a look at the implementation: http://qt.gitorious.org/qt/qt/blobs/4.8/src/3rdparty/webkit/Source/WebKit/qt/Api/qwebframe.cpp

QUrl::isLocalFile() was introduced in this commit (a revert of a revert): http://qt.gitorious.org/qt/qt/commit/58fc06e9fe538eb824a3ffef74137628e1742608

And this is the problematic line: http://qt.gitorious.org/qt/qt/blobs/58fc06e9fe538eb824a3ffef74137628e1742608/src/corelib/io/qurl.cpp#line6073

Again, in the current tip of 4.8: http://qt.gitorious.org/qt/qt/blobs/6907626962f845ae5d713f4e8f6bfefdd15724f6/src/corelib/io/qurl.cpp#line6197

It should be checking also whether the scheme.isEmpty(), as is done in the current tip of 4.7: http://qt.gitorious.org/qt/qt/blobs/a96672bc77642e3e5a030fd9e78f700b47e11ccb/src/corelib/io/qurl.cpp#line6070

Anyway, I whipped up a little demonstration prog (attached). Can anyone give a second opinion? If you build it with 4.7, everything is happy. But, if you build it with 4.8, then the URLs will not match.

I think that this should probably be registered as a bug with the Qt project, though...

execjosh commented 12 years ago

I forgot to write the reason for the exception...

Basically, it ends up trying to read a directory (i.e., the current working directory) instead of the specified file. I think this happens because the URL normalization fails to properly identify empty-schemed local file paths, returning an empty string.

detro commented 12 years ago

detroniz...@gmail.com commented:

From the sound of it, this is a Qt bug :(

execjosh commented 12 years ago

Found a Qt bug report for this issue (via http://bugs.kde.org/285028)!

https://bugreports.qt-project.org/browse/QTBUG-22382

execjosh commented 12 years ago

It looks like the Qt team doesn't think this is a bug. So, I've written a workaround and submitted a pull request.

https://github.com/ariya/phantomjs/pull/200

ariya commented 12 years ago

ariya.hi...@gmail.com commented:

Good finding! It likely needs some unit testing in the our test suite.

I also think enclosing it with Qt version check is a good idea, just to be on the safe side.

 
Metadata Updates

aportale commented 12 years ago

alessandro.portale@gmail.com commented:

Interesting! Indeed good finding that this was related to fromLocalFile. There have been several discussions about the behavioral change of fromLocalFile in Qt 4.8.0, and it seems that the original behavior will after all be restored in 4.8.1, see http://lists.qt-project.org/pipermail/development/2012-February/001776.html

So, you could limit the version check exactly to 4.8.0 :)

execjosh commented 12 years ago

@ariya @aportale: Thanks for the comments :) I've added the #if.

@ariya: I'm not sure how to unit test this with JavaScript, because it basically crashes after the std::bad_alloc...

ariya commented 12 years ago

ariya.hi...@gmail.com commented:

Ah, I forgot that this is about crash. Surely that makes the testing is not trivial :(

I think I'll merge your fix as is. Beside this would be specific to 4.8.0 anyway.

ariya commented 12 years ago

ariya.hi...@gmail.com commented:

Landed: https://github.com/ariya/phantomjs/commit/9123e4b01f https://github.com/ariya/phantomjs/commit/32b3e06ab4

Thanks!

 
Metadata Updates

xbeumala commented 12 years ago

xavi.beu...@gmail.com commented:

Fantastic! many thanks to all

ariya commented 11 years ago

ariya.hi...@gmail.com commented:

Issue 360 has been merged into this issue.