scrapinghub / portia

Visual scraping for Scrapy
BSD 3-Clause "New" or "Revised" License

Portia browser not working #752

Closed hashsup closed 7 years ago

hashsup commented 7 years ago

Hello,

I installed Portia from scratch using Docker on Windows. When I try to set up a spider, the browser shows only an empty page, even though I typed nyt.com as the page and it was redirected correctly to nytimes.com.

I tested with Chrome, Edge, Internet Explorer, and Firefox, and the same error appears in all of them. Here is a screenshot: http://imgur.com/a/2PlBr

This is how I deploy it, and the output that appears:

PS F:\home\portia> docker run -i -t --rm --name portia -v /f/home/portia:/app/data/projects:rw -p 9001:9001 scrapinghub/portia
time="2017-03-08T15:20:41+0000" level=info msg="Unable to use system certificate pool: crypto/x509: system root pool is not available on Windows"
+ action=
+ shift
+ '[' -z '' ']'
+ _run
+ service nginx start
+ _set_env
+ path=/app/portia_server:/app/slyd:/app/slybot
+ export PYTHONPATH=/app/portia_server:/app/slyd:/app/slybot
+ PYTHONPATH=/app/portia_server:/app/slyd:/app/slybot
+ echo /app/portia_server:/app/slyd:/app/slybot
/app/portia_server:/app/slyd:/app/slybot
+ /app/portia_server/manage.py runserver
+ /app/slyd/bin/slyd -p 9002 -r /app/portiaui/dist
2017-03-08 15:20:48+0000 [-] Log opened.
2017-03-08 15:20:48.056698 [-] Splash version: 2.3.2
2017-03-08 15:20:48.059932 [-] WARNING: Lua scripting is not available because 'lupa' Python package is not installed
2017-03-08 15:20:48.061402 [-] Qt 5.5.1, PyQt 5.5.1, WebKit 538.1, sip 4.17, Twisted 16.1.1
2017-03-08 15:20:48.062060 [-] Python 2.7.6 (default, Oct 26 2016, 20:30:19) [GCC 4.8.4]
2017-03-08 15:20:48.062224 [-] Open files limit: 1048576
2017-03-08 15:20:48.062361 [-] Can't bump open files limit
2017-03-08 15:20:48.441041 [-] Xvfb is started: ['Xvfb', ':1869697841', '-screen', '0', '1024x768x24', '-nolisten', 'tcp']
2017-03-08 15:20:53.660729 [-] Site starting on 9002
2017-03-08 15:20:53.663764 [-] Starting factory <slyd.server.Site instance at 0x7f7e39c385f0>
Performing system checks...

System check identified no issues (0 silenced).
March 08, 2017 - 15:20:55
Django version 1.10.5, using settings 'portia_server.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
[08/Mar/2017 15:21:05] "GET /api/projects HTTP/1.0" 200 1212
[08/Mar/2017 15:21:05] "GET /server_capabilities HTTP/1.0" 200 207
[08/Mar/2017 15:21:07] "GET /api/projects/testing/spiders HTTP/1.0" 200 78
[08/Mar/2017 15:21:08] "GET /api/projects/testing/schemas HTTP/1.0" 200 78
[08/Mar/2017 15:21:08] "GET /api/projects/testing HTTP/1.0" 200 633
process 13: D-Bus library appears to be incorrectly set up; failed to read machine uuid: Failed to open "/etc/machine-id": No such file or directory
See the manual page for dbus-uuidgen to correct this issue.
QNetworkReplyImplPrivate::error: Internal problem, this method must only be called once.
QNetworkReplyImplPrivate::error: Internal problem, this method must only be called once.

Any idea where I can look to find out what is happening?

ruairif commented 7 years ago

Fixed in #753. It looks like the JavaScript file that interacts with Splash failed to load correctly.
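
For anyone wondering where to look in output like the above: one quick check is whether any logged API request came back with a non-2xx status. In the log posted above every request is a 200, so the Django side looks healthy, which fits the problem being in the front-end JavaScript rather than in the API. A minimal sketch, assuming the Django runserver access-log format shown above; `non_2xx_requests` is a hypothetical helper name, not part of Portia.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print access-log lines whose HTTP status is not 2xx.
# Assumes the Django runserver line format seen in the log above, e.g.
#   [08/Mar/2017 15:21:05] "GET /api/projects HTTP/1.0" 200 1212
non_2xx_requests() {
  # Split each line on the closing quote + space of the request string;
  # the second field then starts with the status code.
  awk -F'" ' '/^\[[0-9]+\/[A-Za-z]+\/[0-9]+ / && NF >= 2 {
    split($2, rest, " ")
    if (rest[1] !~ /^2[0-9][0-9]$/) print
  }'
}
```

Usage, assuming the container is named portia as in the command above: `docker logs portia 2>&1 | non_2xx_requests`. No output means every logged request succeeded, so the server log alone will not show the cause.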

rootinshell commented 7 years ago

Hello, I've got the same issue:

root@nodejs ~# docker run -v /home/portia:/app/data/projects:rw -p 9001:9001 scrapinghub/portia
+ action=
+ shift
+ '[' -z '' ']'
+ _run
+ service nginx start
+ _set_env
+ path=/app/portia_server:/app/slyd:/app/slybot
+ export PYTHONPATH=/app/portia_server:/app/slyd:/app/slybot
+ PYTHONPATH=/app/portia_server:/app/slyd:/app/slybot
+ echo /app/portia_server:/app/slyd:/app/slybot
+ /app/portia_server/manage.py runserver
+ /app/slyd/bin/slyd -p 9002 -r /app/portiaui/dist
/app/portia_server:/app/slyd:/app/slybot
2017-03-08 23:12:13+0000 [-] Log opened.
2017-03-08 23:12:13.377621 [-] Splash version: 2.3.2
2017-03-08 23:12:13.506991 [-] WARNING: Lua scripting is not available because 'lupa' Python package is not installed
2017-03-08 23:12:13.507167 [-] Qt 5.5.1, PyQt 5.5.1, WebKit 538.1, sip 4.17, Twisted 16.1.1
2017-03-08 23:12:13.507241 [-] Python 2.7.6 (default, Oct 26 2016, 20:30:19) [GCC 4.8.4]
2017-03-08 23:12:13.507313 [-] Open files limit: 1048576
2017-03-08 23:12:13.507371 [-] Can't bump open files limit
2017-03-08 23:12:13.925158 [-] Xvfb is started: ['Xvfb', ':639572937', '-screen', '0', '1024x768x24', '-nolisten', 'tcp']
2017-03-08 23:12:28.378694 [-] Site starting on 9002
2017-03-08 23:12:28.378900 [-] Starting factory <slyd.server.Site instance at 0x7fc65d93ed40>
[08/Mar/2017 23:12:48] "GET /api/projects HTTP/1.0" 200 7030
[08/Mar/2017 23:12:48] "GET /server_capabilities HTTP/1.0" 200 207
[08/Mar/2017 23:12:59] "GET /api/projects/medicament/spiders HTTP/1.0" 200 81
[08/Mar/2017 23:12:59] "GET /api/projects/medicament/schemas HTTP/1.0" 200 81
[08/Mar/2017 23:13:00] "GET /api/projects/medicament HTTP/1.0" 200 654
process 13: D-Bus library appears to be incorrectly set up; failed to read machine uuid: Failed to open "/etc/machine-id": No such file or directory
See the manual page for dbus-uuidgen to correct this issue.
[08/Mar/2017 23:15:07] "GET /api/projects/medicament/download?format=code HTTP/1.0" 200 9819
[08/Mar/2017 23:15:22] "POST /api/projects/medicament/spiders HTTP/1.0" 201 1009
[08/Mar/2017 23:15:22] "GET /api/projects/medicament/spiders/medicament.ma HTTP/1.0" 200 1082
[08/Mar/2017 23:15:22] "GET /api/projects/medicament/spiders/medicament.ma/samples HTTP/1.0" 200 103
2017-03-08 23:15:23.087625 [-] /usr/local/lib/python2.7/dist-packages/scrapy/link.py:21: exceptions.UserWarning: Link urls must be str objects. Assuming utf-8 encoding (which could be wrong)
[08/Mar/2017 23:15:44] "GET /api/projects/medicament/spiders/medicament.ma HTTP/1.0" 200 1082
[08/Mar/2017 23:15:50] "PATCH /api/projects/medicament/spiders/medicament.ma HTTP/1.0" 200 1002
[08/Mar/2017 23:15:53] "PATCH /api/projects/medicament/spiders/medicament.ma HTTP/1.0" 200 1001
[08/Mar/2017 23:15:55] "PATCH /api/projects/medicament/spiders/medicament.ma HTTP/1.0" 200 1002

Any idea how to debug this?

ruairif commented 7 years ago

@rootinshell Check out the changes in https://github.com/scrapinghub/portia/pull/753 and rebuild the assets; then it should run.
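
The first half of that (checking out the pull request) can be sketched as below. This assumes a local clone of scrapinghub/portia whose remote is named `origin`, and relies on GitHub exposing each pull request under `pull/<number>/head`. The function name `fetch_pr` and the local branch name `pr-<number>` are arbitrary choices for illustration. How to rebuild the assets is not spelled out in this thread, so that step is left out rather than guessed.

```shell
#!/usr/bin/env bash
# Sketch: fetch a GitHub pull request into a local branch and switch to it.
# Assumes the current directory is a clone whose "origin" remote is the
# GitHub repository; "pr-<number>" is an arbitrary local branch name.
fetch_pr() {
  local pr="$1"
  git fetch origin "pull/${pr}/head:pr-${pr}" &&
    git checkout "pr-${pr}"
}
```

Usage, from inside the clone: `fetch_pr 753`. After that the working tree matches the pull request, and the asset rebuild mentioned above would be run against it.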

hashsup commented 7 years ago

I wasn't able to rebuild Portia, but I found a way to modify combined.js and moved the inject_this.js script from the first to the last position. Nothing changed. Are you sure this solves the issue?

btotherunner commented 7 years ago

I have the same problem:

[root@kibasrv01 var]# docker run -i -t --rm -v /var/portia:/app/data/projects:rw -p 9001:9001 scrapinghub/portia

System check identified no issues (0 silenced).
March 13, 2017 - 15:11:40
Django version 1.10.5, using settings 'portia_server.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
process 15: D-Bus library appears to be incorrectly set up; failed to read machine uuid: Failed to open "/etc/machine-id": No such file or directory
See the manual page for dbus-uuidgen to correct this issue.

Message from syslogd@kibasrv01 at Mar 13 16:12:23 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1