webrecorder / browsertrix-old

Browsertrix: Containerized High-Fidelity Browser-Based Automated Crawling + Behavior System
Apache License 2.0

xserver problem #32

Open Feribv opened 5 years ago

Feribv commented 5 years ago

Hello, I've installed Browsertrix on CentOS 7 with Python 3.7. Crawling produces WARC files, but it is not using the remote browser. The container running the oldwebtoday/vnc-webrtc-audio image logs the error below:

docker logs xserver-TGLPNODRRZGTJNOCRA3C53XG

Start PulseAudio

W: [pulseaudio] main.c: This program is not intended to be run as root (unless --system is specified).
stored passwd in file: /root/.vnc/passwd

Start WebRTC Audio pipeline

Execute 'python3 -u /app/webrtc.py --port 6082'

(gst-plugin-scanner:25): CRITICAL : 11:07:20.667: Couldn't g_module_open libpython. Reason: /usr/lib/libpython2.7.so: cannot open shared object file: No such file or directory
Signaling: Listening on https://0.0.0.0:6082

There is no /usr/lib/libpython2.7.so on the server; the library exists only in lib64.
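The gst-plugin-scanner message comes from a process inside the xserver container, so the host's library layout may not be the one that matters. As a rough sketch, the check below lists where a given shared library is present. The directory list and the helper name `find_shared_library` are assumptions made for illustration; it would need to be run in whichever environment is being inspected (the log shows python3 is available in that container).

```python
from pathlib import Path

# Candidate directories are an assumption; adjust for the system being checked.
CANDIDATE_DIRS = [
    "/usr/lib",
    "/usr/lib64",
    "/usr/lib/x86_64-linux-gnu",
    "/usr/local/lib",
]


def find_shared_library(prefix, directories=CANDIDATE_DIRS):
    """Return sorted paths of files whose name starts with `prefix`.

    Only the given directories are searched (non-recursively); directories
    that do not exist are skipped.
    """
    matches = []
    for directory in directories:
        base = Path(directory)
        if not base.is_dir():
            continue
        matches.extend(str(path) for path in base.glob(prefix + "*"))
    return sorted(matches)


if __name__ == "__main__":
    found = find_shared_library("libpython2.7.so")
    if found:
        for path in found:
            print(path)
    else:
        print("libpython2.7.so not found in the candidate directories")
```

An empty result inside the container would be consistent with the "cannot open shared object file" message above; it does not by itself show whether that message is related to the remote browser not being used.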

On the oldwebtoday/chrome:73 image I get:

docker logs browser-GJLDLADDQGKJSN73F3LBI7ET

IP: 172.31.0.2
--2019-08-23 11:13:55-- http://wsgiprox/download/pem
Connecting to 172.31.0.2:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2811 (2.7K) [application/x-x509-ca-cert]
Saving to: '/tmp/proxy-ca.pem'

 0K ..                                                    100%  264M=0s

2019-08-23 11:13:55 (264 MB/s) - '/tmp/proxy-ca.pem' saved [2811/2811]

Error opening input terminal for read
Chrome Not Found
Chrome Not Found
Chrome Not Found
Chrome Not Found
[42:42:0823/111401.507623:ERROR:browser_dm_token_storage_linux.cc(101)] Error: /etc/machine-id contains 0 characters (32 were expected).
Chrome Not Found
Chrome Not Found
Restarting process
Chrome Not Found
Chrome Not Found
2019/08/23 11:14:03 socat[109] E connect(5, AF=2 127.0.0.1:9221, 16): Connection refused
[42:111:0823/111403.603649:ERROR:bus.cc(396)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
Chrome Not Found
2019/08/23 11:14:04 socat[148] E connect(5, AF=2 127.0.0.1:9221, 16): Connection refused
[125:125:0823/111404.323891:ERROR:sandbox_linux.cc(364)] InitializeSandbox() called with multiple threads in process gpu-process.
Chrome Not Found
2019/08/23 11:14:04 socat[331] E connect(5, AF=2 127.0.0.1:9221, 16): Connection refused
[42:329:0823/111404.904280:ERROR:bus.cc(396)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
Chrome Not Found

DevTools listening on ws://127.0.0.1:9221/devtools/browser/ba447e2c-1427-4995-9faa-c13e76282141
Chrome Not Found
Restarting process

(google-chrome:42): LIBDBUSMENU-GLIB-WARNING **: 11:14:05.646: Unable to get session bus: Unknown or unsupported transport “disabled” for address “disabled:”
Chrome Not Found
Chrome Found
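The socat "Connection refused" lines all target 127.0.0.1:9221, and the log later reports "DevTools listening" on that same port followed by "Chrome Found". One possible reading is that the port was simply not accepting connections yet while Chrome was starting. A minimal sketch for checking whether a TCP port accepts connections is below; `wait_for_port` is a hypothetical helper, and because the address is loopback it would have to run inside the browser container, assuming Python 3 is available there.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    has succeeded by the time `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Port 9221 is the DevTools port shown in the log above.
    print(wait_for_port("127.0.0.1", 9221, timeout=10.0))
```

A True result only shows that something is listening on the port; it says nothing about whether the crawler is actually driving that browser.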

Any idea what is going wrong with browsertrix? Thanks!