ariya / phantomjs

Scriptable Headless Browser
http://phantomjs.org
BSD 3-Clause "New" or "Revised" License

Cannot retrieve body using any release and cannot build on OSX El Capitan #13774

Closed vAlmaraz closed 8 years ago

vAlmaraz commented 9 years ago

I am trying to save every resource file downloaded while a page loads. I searched the existing issues and found that a body attribute was added to the "onResourceReceived" event in this commit: https://github.com/ariya/phantomjs/commit/434d4e0101a540525e8f89a657ea553fb38b040b

I have tried several versions: 1.7, 1.8, 1.9 and 2.0. I am unable to retrieve the body with any of them; the response has no body attribute.

In the source code, the body attribute is set and filled when the response finishes, so I tried to build PhantomJS myself. When I run build_and_package.sh, the first thing displayed is an error:

Building Qt and PhantomJS with debugging symbols. If you have previously built without debugging symbols, you should run:

$ git clean -xdff

usage: build.py [-h] [-r] [-d] [-j JOBS] [-c] [-n] [-s]
            [--qmake-args QMAKE_ARGS]
            [--webkit-qmake-args WEBKIT_QMAKE_ARGS]
            [--phantomjs-qmake-args PHANTOMJS_QMAKE_ARGS]
            [--qt-config QT_CONFIG] [--git-clean-qtbase]
            [--git-clean-qtwebkit] [--skip-qtbase]
            [--skip-configure-qtbase] [--skip-qtwebkit]
            [--skip-configure-qtwebkit]
build.py: error: unrecognized arguments: --release-debug

After running git clean and getting the same result, I edited the bash script to remove "-debug". That gets past the argument error, but a few commands later the build fails with this:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ar cq libQt5WebKit.a 
ar: no archive members specified
usage:  ar -d [-TLsv] archive file ...
    ar -m [-TLsv] archive file ...
    ar -m [-abiTLsv] position archive file ...
    ar -p [-TLsv] archive [file ...]
    ar -q [-cTLsv] archive file ...
    ar -r [-cuTLsv] archive file ...
    ar -r [-abciuTLsv] position archive file ...
    ar -t [-TLsv] archive [file ...]
    ar -x [-ouTLsv] archive [file ...]
make[2]: *** [/Users/victoralmaraz/Documents/repositorios/phantomjs/src/qt/qtbase/lib/libQt5WebKit.a] Error 1
make[1]: *** [sub-api-pri-make_first-ordered] Error 2
make: *** [sub-Source-QtWebKit-pro-make_first-ordered] Error 2

ERROR: Failed to build PhantomJS! Building Qt WebKit failed.
phantomjs was not built yet, please run build.sh first

If I run build.py directly, the same error is thrown. Do I have an incompatible version of Xcode?

astefanutti commented 9 years ago

@vAlmaraz for the last compilation issue, you need the latest fix from Vitallium/qtwebkit@07b6571e2356e5cf12029a3c2a9bf78b6472ed9f.

vAlmaraz commented 9 years ago

Thank you for your answer, @astefanutti. I downloaded that version (the current version didn't work) and tried to build again, but after a while it throws the following errors:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.7 -Wall -W -fPIC -DUSE_UTF8 -DSTATIC_BUILD -DQCOMMANDLINE_STATIC -DQT_NO_DEBUG -DQT_WEBKITWIDGETS_LIB -DQT_WIDGETS_LIB -DQT_WEBKIT_LIB -DQT_GUI_LIB -DQT_NETWORK_LIB -DQT_CORE_LIB -I. -Imongoose -Ilinenoise/src -Iqcommandline -Iqt/qtbase/include -Iqt/qtbase/include/QtWebKitWidgets -Iqt/qtbase/include/QtWidgets -Iqt/qtbase/include/QtWebKit -Iqt/qtbase/include/QtGui -Iqt/qtbase/include/QtNetwork -Iqt/qtbase/include/QtCore -I. -Iqt/qtbase/mkspecs/macx-clang -o system.o system.cpp
cookiejar.cpp:45:12: error: use of overloaded operator '<<' is ambiguous (with operand types 'QDataStream' and 'int')
    stream << COOKIE_JAR_VERSION;
qt/qtbase/include/QtCore/../../src/corelib/tools/qchar.h:584:28: note: candidate function
Q_CORE_EXPORT QDataStream &operator<<(QDataStream &, QChar);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:534:28: note: candidate function
Q_CORE_EXPORT QDataStream& operator<< (QDataStream& s, const QVariant& p);
                           ^
cookiejar.cpp:46:12: error: use of overloaded operator '<<' is ambiguous (with operand types 'QDataStream' and 'quint32' (aka 'unsigned int'))
    stream << quint32(list.size());
    ~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~
qt/qtbase/include/QtCore/../../src/corelib/tools/qchar.h:584:28: note: candidate function
Q_CORE_EXPORT QDataStream &operator<<(QDataStream &, QChar);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:534:28: note: candidate function
Q_CORE_EXPORT QDataStream& operator<< (QDataStream& s, const QVariant& p);
                           ^
cookiejar.cpp:58:12: error: invalid operands to binary expression ('QDataStream' and 'quint32' (aka 'unsigned int'))
    stream >> version;
    ~~~~~~ ^  ~~~~~~~
qt/qtbase/include/QtCore/../../src/corelib/tools/qchar.h:585:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QChar &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QChar &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qbytearray.h:663:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QByteArray &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QByteArray &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qstring.h:1327:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QString &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QString &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qregexp.h:115:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QRegExp &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &in, QRegExp &regExp);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:533:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QVariant &' for 2nd argument
Q_CORE_EXPORT QDataStream& operator>> (QDataStream& s, QVariant& p);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:535:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QVariant::Type &' for 2nd argument
Q_CORE_EXPORT QDataStream& operator>> (QDataStream& s, QVariant::Type& p);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qlocale.h:1011:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QLocale &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QLocale &);
                           ^
qt/qtbase/include/QtNetwork/../../src/network/kernel/qhostaddress.h:137:31: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QHostAddress &' for 2nd argument
Q_NETWORK_EXPORT QDataStream &operator>>(QDataStream &, QHostAddress &);
                              ^
qt/qtbase/include/QtCore/../../src/corelib/io/qurl.h:399:28: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int')
      to 'QUrl &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QUrl &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qpoint.h:99:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QPoint &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QPoint &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qpoint.h:259:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QPointF &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QPointF &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qsize.h:95:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QSize &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QSize &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qsize.h:258:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QSizeF &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QSizeF &);
                           ^
qt/qtbase/include/QtGui/../../src/gui/kernel/qcursor.h:113:27: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int')
      to 'QCursor &' for 2nd argument
Q_GUI_EXPORT QDataStream &operator>>(QDataStream &inS, QCursor &cursor);
                          ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:343:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QDate &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QDate &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:345:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QTime &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QTime &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:347:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QDateTime &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QDateTime &);
                           ^
cookiejar.cpp:53:14: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int') to 'QList<QNetworkCookie> &' for 2nd
      argument
QDataStream& operator>>(QDataStream& stream, QList<QNetworkCookie>& list)
             ^
qt/qtbase/include/QtCore/../../src/corelib/io/qtextstream.h:219:21: note: candidate function not viable: cannot convert argument of incomplete type
      'QDataStream' to 'QTextStream &'
inline QTextStream &operator>>(QTextStream &s, QTextStreamFunction f)
                    ^
cookiejar.cpp:65:12: error: invalid operands to binary expression ('QDataStream' and 'quint32' (aka 'unsigned int'))
    stream >> count;
    ~~~~~~ ^  ~~~~~
qt/qtbase/include/QtCore/../../src/corelib/tools/qchar.h:585:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QChar &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QChar &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qbytearray.h:663:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QByteArray &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QByteArray &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qstring.h:1327:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QString &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QString &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qregexp.h:115:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QRegExp &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &in, QRegExp &regExp);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:533:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QVariant &' for 2nd argument
Q_CORE_EXPORT QDataStream& operator>> (QDataStream& s, QVariant& p);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/kernel/qvariant.h:535:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QVariant::Type &' for 2nd argument
Q_CORE_EXPORT QDataStream& operator>> (QDataStream& s, QVariant::Type& p);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qlocale.h:1011:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QLocale &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QLocale &);
                           ^
qt/qtbase/include/QtNetwork/../../src/network/kernel/qhostaddress.h:137:31: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QHostAddress &' for 2nd argument
Q_NETWORK_EXPORT QDataStream &operator>>(QDataStream &, QHostAddress &);
                              ^
qt/qtbase/include/QtCore/../../src/corelib/io/qurl.h:399:28: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int')
      to 'QUrl &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QUrl &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qpoint.h:99:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QPoint &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QPoint &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qpoint.h:259:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QPointF &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QPointF &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qsize.h:95:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QSize &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QSize &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qsize.h:258:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QSizeF &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QSizeF &);
                           ^
qt/qtbase/include/QtGui/../../src/gui/kernel/qcursor.h:113:27: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int')
      to 'QCursor &' for 2nd argument
Q_GUI_EXPORT QDataStream &operator>>(QDataStream &inS, QCursor &cursor);
                          ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:343:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QDate &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QDate &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:345:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QTime &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QTime &);
                           ^
qt/qtbase/include/QtCore/../../src/corelib/tools/qdatetime.h:347:28: note: candidate function not viable: no known conversion from 'quint32'
      (aka 'unsigned int') to 'QDateTime &' for 2nd argument
Q_CORE_EXPORT QDataStream &operator>>(QDataStream &, QDateTime &);
                           ^
cookiejar.cpp:53:14: note: candidate function not viable: no known conversion from 'quint32' (aka 'unsigned int') to 'QList<QNetworkCookie> &' for 2nd
      argument
QDataStream& operator>>(QDataStream& stream, QList<QNetworkCookie>& list)
             ^
qt/qtbase/include/QtCore/../../src/corelib/io/qtextstream.h:219:21: note: candidate function not viable: cannot convert argument of incomplete type
      'QDataStream' to 'QTextStream &'
inline QTextStream &operator>>(QTextStream &s, QTextStreamFunction f)
                    ^
cookiejar.cpp:76:19: error: member access into incomplete type 'QDataStream'
        if (stream.atEnd()) {
                  ^
qt/qtbase/include/QtCore/../../src/corelib/global/qglobal.h:570:7: note: forward declaration of 'QDataStream'
class QDataStream;
      ^
5 errors generated.
make[1]: *** [cookiejar.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [sub-src-phantomjs-pro-make_first-ordered] Error 2

ERROR: Failed to build PhantomJS! Building PhantomJS failed.
phantomjs was not built yet, please run build.sh first

astefanutti commented 9 years ago

That should be fixed by 091131f8e66740d8e361630620e1c449b690d820. Make sure you have all the latest changes from both submodules and the main project. I've just managed to produce a Mac OS X binary with Xcode 7.

vAlmaraz commented 9 years ago

Thank you again, I am able to build it!

Now the problem is that body is an empty string in the onResourceReceived event when using the latest version. It was working when the fix was first added (commit: https://github.com/ariya/phantomjs/commit/434d4e0101a540525e8f89a657ea553fb38b040b).

It has been reported here: https://github.com/ariya/phantomjs/issues/13498

jason-son commented 9 years ago

I'm using this binary https://github.com/Vitallium/phantomjs/releases/tag/2.0.1 on Mac OS X 10.11 and found the same problem: response.body is an empty string!

vassilevsky commented 8 years ago

I have successfully built PhantomJS from master at 190a927 on OS X El Capitan 🎉

Could you give me a script to run that tries to save the bodies? I don't like writing JavaScript 😭

vAlmaraz commented 8 years ago

Hi @vassilevsky,

Thank you for your support. Please, could you try with this script?

var page = require('webpage').create();
var fs = require('fs');
var path = 'output/';

page.onResourceReceived = function(response) {
    if (response.stage == 'end') {
        var filePath = response.url.substr(7);
        console.log('Saving: ' + filePath);
        fs.write(path + filePath + '_headers.txt', JSON.stringify(response), 'w');
        fs.write(path + filePath, response.body, 'w');
    }
};

page.open('http://google.es', function (status) {
    if (status !== 'success') {
        console.log('Unable to access the network!');
    } else {
        page.render('screenshot.png');
    }
    phantom.exit();
});

It should save the headers and body of each resource in an output folder. The headers are saved, but in the official releases the body attribute is not present. When I compiled the latest version of the repository, body was always an empty string.
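
A side note on the path handling in the script above: response.url.substr(7) assumes every URL begins with exactly "http://", so an "https://" URL would keep a leading "/", and a URL ending in "/" produces a path that names a directory rather than a file. A minimal sketch of a more tolerant mapping is shown here. urlToFilePath is a hypothetical helper name, not part of PhantomJS, and saving directory-style URLs as index.html is just one possible convention.

```javascript
// Hypothetical helper (not a PhantomJS API): map a resource URL to a relative file path.
function urlToFilePath(url) {
    // Strip any scheme prefix such as "http://" or "https://".
    var filePath = url.replace(/^[a-z][a-z0-9+.\-]*:\/\//i, '');
    // A path ending in "/" would name a directory, so give it a file name.
    if (filePath.charAt(filePath.length - 1) === '/') {
        filePath += 'index.html';
    }
    return filePath;
}

console.log(urlToFilePath('http://google.es/'));
console.log(urlToFilePath('https://ssl.gstatic.com/gb/images/b_8d5afc09.png'));
```

Inside the handler this would stand in for the substr(7) call, for example var filePath = urlToFilePath(response.url);. Query strings are left untouched in this sketch.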

vassilevsky commented 8 years ago

Script output:

Saving: google.es/
Saving: www.google.es/
Saving: ssl.gstatic.com/gb/images/b_8d5afc09.png
Saving: www.google.es/images/icons/product/chrome-48.png
Saving: www.google.es/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png
Saving: www.google.es/textinputassistant/tia.png
Saving: www.google.es/client_204?&atyp=i&biw=400&bih=300&ei=ytZtVuOXN8jXyQObiIBA
Saving: www.google.es/images/nav_logo229.png
Saving: www.google.es/xjs/_/js/k=xjs.hp.en_US.jCb2JWHbn2Y.O/m=sb_he,d/rt=j/d=1/t=zcms/rs=ACT90oFGQp49GxSuY0e6lmI--ryX7BUbGw

Unfortunately, you are right. Bodies are not saved:

$ find output -type f | xargs ls -l
-rw-r--r--  1 vassilevsky  staff   699 13 дек 23:36 output/google.es/_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/ssl.gstatic.com/gb/images/b_8d5afc09.png
-rw-r--r--  1 vassilevsky  staff   750 13 дек 23:36 output/ssl.gstatic.com/gb/images/b_8d5afc09.png_headers.txt
-rw-r--r--  1 vassilevsky  staff  1202 13 дек 23:36 output/www.google.es/_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/client_204?&atyp=i&biw=400&bih=300&ei=ytZtVuOXN8jXyQObiIBA
-rw-r--r--  1 vassilevsky  staff   563 13 дек 23:36 output/www.google.es/client_204?&atyp=i&biw=400&bih=300&ei=ytZtVuOXN8jXyQObiIBA_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png
-rw-r--r--  1 vassilevsky  staff   736 13 дек 23:36 output/www.google.es/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/images/icons/product/chrome-48.png
-rw-r--r--  1 vassilevsky  staff   691 13 дек 23:36 output/www.google.es/images/icons/product/chrome-48.png_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/images/nav_logo229.png
-rw-r--r--  1 vassilevsky  staff   683 13 дек 23:36 output/www.google.es/images/nav_logo229.png_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/textinputassistant/tia.png
-rw-r--r--  1 vassilevsky  staff   684 13 дек 23:36 output/www.google.es/textinputassistant/tia.png_headers.txt
-rw-r--r--  1 vassilevsky  staff     0 13 дек 23:36 output/www.google.es/xjs/_/js/k=xjs.hp.en_US.jCb2JWHbn2Y.O/m=sb_he,d/rt=j/d=1/t=zcms/rs=ACT90oFGQp49GxSuY0e6lmI--ryX7BUbGw
-rw-r--r--  1 vassilevsky  staff   846 13 дек 23:36 output/www.google.es/xjs/_/js/k=xjs.hp.en_US.jCb2JWHbn2Y.O/m=sb_he,d/rt=j/d=1/t=zcms/rs=ACT90oFGQp49GxSuY0e6lmI--ryX7BUbGw_headers.txt

zackw commented 8 years ago

Thanks for the very clear test case. I think @Vitallium is working on it, but if anyone else wants to dig into the code and send us a pull request, that would probably speed matters up.

erikdubbelboer commented 8 years ago

You have to set page.captureContent, see: https://github.com/ariya/phantomjs/blob/f69d44b829aaf281441edff7506dde2d5c33ad07/test/module/webpage/capture-content.js#L13

jason-son commented 8 years ago

@erikdubbelboer that's the key, thanks a lot! With page.captureContent = ['.*']; it's working! @vAlmaraz, you can also refer to the SlimerJS docs: http://docs.slimerjs.org/current/api/webpage.html#capturecontent
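
To tell apart the situations described in this thread, a small check inside the handler may help. The sketch below is only an illustration: bodyStatus is a hypothetical helper, not a PhantomJS API. It distinguishes a response with no body attribute (what was reported for the releases) from one whose body is an empty string (what the master build produced before page.captureContent was set). Note that an empty body can also be legitimate, for example for a 204 response.

```javascript
// Hypothetical helper (not a PhantomJS API): classify the body field
// of the response object passed to page.onResourceReceived.
function bodyStatus(response) {
    if (typeof response.body !== 'string') {
        return 'missing';   // no usable body attribute on the response
    }
    if (response.body === '') {
        return 'empty';     // attribute exists but nothing was captured
    }
    return 'captured';      // body content is available
}

console.log(bodyStatus({ url: 'http://google.es/', stage: 'end' }));
console.log(bodyStatus({ url: 'http://google.es/', stage: 'end', body: '' }));
console.log(bodyStatus({ url: 'http://google.es/', stage: 'end', body: '<html></html>' }));
```

Logging bodyStatus(response) next to the "Saving:" message would show at a glance whether setting captureContent had any effect.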

vAlmaraz commented 8 years ago

Hi All,

Thank you for the help. I have tried it, but it doesn't work (PhantomJS 2.0 on Windows 10).

This is the code:

var page = require('webpage').create();
var fs = require('fs');
var path = 'output/';

page.captureContent = ['.*'];

page.onResourceReceived = function(response) {
    if (response.stage == 'end') {
        var filePath = response.url.substr(7);
        console.log('Saving: ' + filePath);
        fs.write(path + filePath + '_headers.txt', JSON.stringify(response), 'w');
        fs.write(path + filePath, response.body, 'w');
    }
};

page.open('http://google.es', function (status) {
    if (status !== 'success') {
        console.log('Unable to access the network!');
    } else {
        page.render('screenshot.png');
    }
    phantom.exit();
});

Am I doing something wrong?

vitallium commented 8 years ago

Just tried your sample with PhantomJS 2.0.1 preview. It works:

│   body.js
│   phantomjs.exe
│   screenshot.png
│
└───output
    ├───clients1.google.es
    │       generate_204
    │       generate_204_headers.txt
    │
    ├───ssl.gstatic.com
    │   └───gb
    │       └───images
    │               b_8d5afc09.png
    │               b_8d5afc09.png_headers.txt
    │
    └───www.google.es
        │   _headers.txt
        │
        ├───images
        │   │   nav_logo229.png
        │   │   nav_logo229.png_headers.txt
        │   │
        │   ├───branding
        │   │   └───googlelogo
        │   │       └───1x
        │   │               googlelogo_white_background_color_272x92dp.png
        │   │               googlelogo_white_background_color_272x92dp.png_headers.txt
        │   │
        │   └───icons
        │       └───product
        │               chrome-48.png
        │               chrome-48.png_headers.txt
        │
        └───xjs
            └───_
                └───js
                    └───k=xjs.hp.en_US.jCb2JWHbn2Y.O
                        └───m=sb_he,d
                            └───rt=j
                                └───d=1
                                    └───t=zcms
                                            rs=ACT90oFGQp49GxSuY0e6lmI--ryX7BUbGw
                                            rs=ACT90oFGQp49GxSuY0e6lmI--ryX7BUbGw_headers.txt

vAlmaraz commented 8 years ago

Thank you very much! It works. This is the link to download the preview, in case anyone needs it: https://github.com/Vitallium/phantomjs/releases/tag/2.0.1

Now, when I run the script, a dialog pops up telling me "There is no disk in the drive. Please insert a disk into D:".

Also, what encoding does PhantomJS use for images? It saves them, but the files contain several Unicode-encoded characters. I have tried to decode them without success:

var unicodeRegex = /\\u([\d\w]{4})/gi;
var body = response.body.replace(unicodeRegex, function (match, grp) {
    return String.fromCharCode(parseInt(grp, 16)); 
});
body = unescape(body);

Thanks a lot!

zackw commented 8 years ago

@vAlmaraz Images are binary data. They shouldn't be saved as text in the first place. I modified the script above as follows:

var page = require('webpage').create();
var fs = require('fs');
var path = 'output/';

page.captureContent = ['.*'];

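page.onResourceReceived = function(response) {
    // The handler fires at several stages per resource; only handle the final one.
    if (response.stage == 'end') {
        // Drop the leading 'http://' (7 characters) to get a relative path.
        var filePath = response.url.substr(7);
        // A URL ending in '/' names a directory, so save it as index.html.
        if (filePath.substr(-1) == '/') {
            filePath += 'index.html';
        }
        console.log('Saving: ' + filePath);
        fs.write(path + filePath + '_headers.txt', JSON.stringify(response), 'w');
        // 'wb' writes the body in binary mode so that PNGs are saved intact.
        fs.write(path + filePath, response.body, 'wb');
    }
};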

page.open('http://google.es', function (status) {
    if (status !== 'success') {
        console.log('Unable to access the network!');
    } else {
        page.render('screenshot.png');
    }
    phantom.exit();
});

and now I get valid PNGs saved. The key change is to write out response.body using mode 'wb' instead of just 'w'.
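
As a rough illustration of why the mode matters: assuming response.body holds binary data as a string with one character per byte, and assuming text mode encodes that string as UTF-8 on write (both are assumptions about PhantomJS, not something confirmed in this thread), every byte value of 0x80 or above would be written as two bytes. The sketch below is plain JavaScript with a hypothetical utf8ByteLength helper; it counts what UTF-8 encoding would do to the 8-byte PNG signature.

```javascript
// Hypothetical helper: number of bytes a string occupies once encoded as UTF-8.
// Surrogate pairs are not handled; a one-character-per-byte string never contains them.
function utf8ByteLength(str) {
    var bytes = 0;
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        if (code < 0x80) {
            bytes += 1;     // ASCII stays a single byte
        } else if (code < 0x800) {
            bytes += 2;     // 0x80..0x7FF becomes two bytes
        } else {
            bytes += 3;     // remaining BMP characters become three bytes
        }
    }
    return bytes;
}

// The 8-byte PNG signature held as a string, one character per byte.
var pngSignature = '\x89PNG\r\n\x1a\n';

console.log(pngSignature.length);           // 8
console.log(utf8ByteLength(pngSignature));  // 9
```

Under those assumptions the very first byte of every PNG (0x89) would already be expanded, which is consistent with the files being unreadable until they are written in binary mode.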


Since this turns out not to be a bug after all I am going to go ahead and close the issue. If you need more help please take it to either

so that we can keep the bug tracker for things that are definitely bugs.

If you have not read it yet, please check also a few tips on Effective Q&A.