Closed agogodavid closed 3 years ago
I saw 537, but it did not fix the problem.
To help with your issue:
1) Please list the installation steps you took (the output of running `history` should do it).
2) Please paste the output of `conda list`, run from the same shell you're running `python demo.py` in.
3) Please paste the output of `ls` in the OpenWPM directory.
4) Please paste the output of running `install.sh` (re-running it won't cause a problem).
Also, are you running on a fresh install of Ubuntu 18.04 Server?
Yes, fresh install of Ubuntu 18.04.4 LTS (GNU/Linux 5.3.0-1018-gcp x86_64), running on Google Cloud Platform. I just retried on a vanilla install (the first time, I had set up Jupyter Notebook before installing OpenWpm).
(openwpm) agogodavid@quickvanillatry:~/OpenWpm$ history
1 cd OpenWpm/
2 ./install.sh
3 ls
4 ./install.sh
5 cd tmp
6 ls
7 bash Anaconda3-2020.02-Linux-x86_64.sh
8 conda
9 source ~/.bashrc
10 conda
11 cd ..
12 ls
13 ./install.sh
14 ls
15 python demo.py
16 conda activate openwpm
17 python demo.py
18 history
(openwpm) agogodavid@quickvanillatry:~/OpenWpm$ history ****
# packages in environment at /home/agogodavid/anaconda3/envs/openwpm:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_llvm conda-forge
abseil-cpp 20200225.2 he1b5a44_0 conda-forge
amazon-kclpy 2.0.1 pypi_0 pypi
appdirs 1.4.3 py_1 conda-forge
argparse 1.4.0 pypi_0 pypi
arrow-cpp 0.17.0 py38he7e9c3a_2_cpu conda-forge
attrs 19.3.0 py_0 conda-forge
autopep8 1.5.2 pyh9f0ad1d_0 conda-forge
aws-sam-translator 1.23.0 pypi_0 pypi
aws-sdk-cpp 1.7.164 hc831370_1 conda-forge
aws-xray-sdk 2.5.0 pypi_0 pypi
backcall 0.1.0 py_0 conda-forge
beautifulsoup4 4.9.0 py38h32f6830_0 conda-forge
boost-cpp 1.72.0 h8e57a91_0 conda-forge
boto 2.49.0 pypi_0 pypi
boto3 1.13.12 pyh9f0ad1d_0 conda-forge
botocore 1.16.12 pyh9f0ad1d_0 conda-forge
brotli 1.0.7 he1b5a44_1001 conda-forge
brotlipy 0.7.0 py38h1e0a361_1000 conda-forge
bzip2 1.0.8 h516909a_2 conda-forge
c-ares 1.15.0 h516909a_1001 conda-forge
ca-certificates 2020.4.5.1 hecc5488_0 conda-forge
certifi 2020.4.5.1 py38h32f6830_0 conda-forge
cffi 1.14.0 py38hd463f26_0 conda-forge
cfgv 3.1.0 py_0 conda-forge
cfn-lint 0.31.1 pypi_0 pypi
chardet 3.0.4 pypi_0 pypi
click 7.1.2 pypi_0 pypi
crontab 0.22.6 pypi_0 pypi
cryptography 2.9.2 py38h766eaa4_0 conda-forge
curl 7.69.1 h33f0ec9_0 conda-forge
decorator 4.4.2 py_0 conda-forge
dill 0.3.1.1 py38h32f6830_1 conda-forge
distlib 0.3.0 pyh9f0ad1d_0 conda-forge
dnslib 0.9.12 py_0 conda-forge
dnspython 1.16.0 py_1 conda-forge
docker 4.2.0 pypi_0 pypi
docopt 0.6.2 py_1 conda-forge
docutils 0.15.2 py38_0 conda-forge
domain-utils 0.7.1 pypi_0 pypi
easyprocess 0.2.10 py38_0 conda-forge
(base) agogodavid@quickvanillatry:~/OpenWpm$ ls
CHANGELOG Dockerfile README.md __init__.py crawler.py docs firefox-bin scripts test
CODE_OF_CONDUCT.md LICENSE VERSION automation demo.py environment.yaml install.sh setup.cfg tmp
(base) agogodavid@quickvanillatry:~/OpenWpm$ ./install.sh
Creating / Overwriting openwpm conda environment.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Activating environment.
Installing firefox.
Installing for Linux
Firefox succesfully installed
Building extension.
~/OpenWpm/automation/Extension/firefox ~/OpenWpm
> dtrace-provider@0.8.8 install /home/agogodavid/OpenWpm/automation/Extension/firefox/node_modules/dtrace-provider
> node-gyp rebuild || node suppress-error.js
gyp ERR! build error
gyp ERR! stack Error: not found: make
gyp ERR! stack at getNotFoundError (/home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/which/which.js:13:12)
gyp ERR! stack at F (/home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/which/which.js:68:19)
gyp ERR! stack at E (/home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/which/which.js:80:29)
gyp ERR! stack at /home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/which/which.js:89:16
gyp ERR! stack at /home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/isexe/index.js:42:5
gyp ERR! stack at /home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/isexe/mode.js:8:5
gyp ERR! stack at FSReqCallback.oncomplete (fs.js:171:21)
gyp ERR! System Linux 5.3.0-1018-gcp
gyp ERR! command "/home/agogodavid/anaconda3/envs/openwpm/bin/node" "/home/agogodavid/anaconda3/envs/openwpm/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /home/agogodavid/OpenWpm/automation/Extension/firefox/node_modules/dtrace-provider
gyp ERR! node -v v13.13.0
gyp ERR! node-gyp -v v5.1.0
gyp ERR! not ok
> core-js@2.6.11 postinstall /home/agogodavid/OpenWpm/automation/Extension/firefox/node_modules/core-js
> node -e "try{require('./postinstall')}catch(e){}"
(base) agogodavid@quickvanillatry:~/OpenWpm$ conda activate openwpm
(openwpm) agogodavid@quickvanillatry:~/OpenWpm$ python demo.py
BrowserManager - INFO - BROWSER 1: Launching browser...
BrowserManager - ERROR - BROWSER 1: Crash in driver, restarting browser manager
Traceback (most recent call last):
File "/home/agogodavid/OpenWpm/automation/BrowserManager.py", line 446, in BrowserManager
driver, prof_folder, browser_settings = deploy_browser.deploy_browser(
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_browser.py", line 13, in deploy_browser
return deploy_firefox.deploy_firefox(status_queue, browser_params,
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_firefox.py", line 202, in deploy_firefox
driver = webdriver.Firefox(firefox_profile=fp, firefox_binary=fb,
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
RemoteWebDriver.__init__(
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: invalid argument: can't kill an exited process
BrowserManager - ERROR - BROWSER 1: Spawn unsuccessful | Proxy Ready: False | Profile Created: True | Profile Tar: True | Display: True | Launch Attempted: True | Browser Launched: False | Browser Ready: False
BrowserManager - ERROR - BROWSER 1: Crash in driver, restarting browser manager
Traceback (most recent call last):
File "/home/agogodavid/OpenWpm/automation/BrowserManager.py", line 446, in BrowserManager
driver, prof_folder, browser_settings = deploy_browser.deploy_browser(
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_browser.py", line 13, in deploy_browser
return deploy_firefox.deploy_firefox(status_queue, browser_params,
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_firefox.py", line 202, in deploy_firefox
driver = webdriver.Firefox(firefox_profile=fp, firefox_binary=fb,
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
RemoteWebDriver.__init__(
False | Browser Ready: False
BrowserManager - ERROR - BROWSER 1: Crash in driver, restarting browser manager
Traceback (most recent call last):
File "/home/agogodavid/OpenWpm/automation/BrowserManager.py", line 446, in BrowserManager
driver, prof_folder, browser_settings = deploy_browser.deploy_browser(
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_browser.py", line 13, in deploy_browser
return deploy_firefox.deploy_firefox(status_queue, browser_params,
File "/home/agogodavid/OpenWpm/automation/DeployBrowsers/deploy_firefox.py", line 202, in deploy_firefox
driver = webdriver.Firefox(firefox_profile=fp, firefox_binary=fb,
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
RemoteWebDriver.__init__(
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
self.start_session(capabilities, browser_profile)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: invalid argument: can't kill an exited process
BrowserManager - ERROR - BROWSER 1: Spawn unsuccessful | Proxy Ready: False | Profile Created: True | Profile Tar: True | Display: True | Launch Attempted: True | Browser Launched: False | Browser Ready: False
TaskManager - CRITICAL - Browser spawn failure during TaskManager initialization, exiting...
BaseAggregator - INFO - Received shutdown signal!
BaseAggregator - INFO - Queue was flushed completely
Shutdown took 46.42465305328369 seconds
XPCOMGlueLoad error for file /home/agogodavid/OpenWpm/firefox-bin/libmozgtk.so:
libgtk-3.so.0: cannot open shared object file: No such file or directory
Couldn't load XPCOM.
Traceback (most recent call last):
File "/home/agogodavid/OpenWpm/automation/utilities/platform_utils.py", line 76, in get_version
firefox = subprocess.check_output([firefox_binary_path, "--version"])
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/subprocess.py", line 411, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/home/agogodavid/anaconda3/envs/openwpm/lib/python3.8/subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/agogodavid/OpenWpm/firefox-bin/firefox-bin', '--version']' returned non-zero exit status 255.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "demo.py", line 37, in <module>
manager = TaskManager.TaskManager(manager_params, browser_params)
File "/home/agogodavid/OpenWpm/automation/TaskManager.py", line 156, in __init__
openwpm_v, browser_v = get_version()
File "/home/agogodavid/OpenWpm/automation/utilities/platform_utils.py", line 78, in get_version
raise RuntimeError("Firefox not found. "
RuntimeError: Firefox not found. Did you run `./install.sh`?
Thanks so much!
The problem is:
Shutdown took 46.42465305328369 seconds
XPCOMGlueLoad error for file /home/agogodavid/OpenWpm/firefox-bin/libmozgtk.so:
libgtk-3.so.0: cannot open shared object file: No such file or directory
Couldn't load XPCOM.
Your Ubuntu server installation is missing some dependencies.
These dependencies are listed here: https://github.com/mozilla/OpenWPM/blob/master/Dockerfile#L14-L16
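A quick way to confirm a diagnosis like this is to ask the dynamic loader directly whether it can open the library named in the error. The sketch below is only an illustration: `can_load` is a hypothetical helper name, and the list of names to check contains only the one library taken from the log above. Note the assumption that the Python process and Firefox search the same library paths; inside a conda environment that may not hold, so a positive result here is weaker evidence than a negative one.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader of this process can open `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library (or one of its own dependencies) is not found.
        return False
    return True


if __name__ == "__main__":
    # Library name copied from the XPCOMGlueLoad error in this thread.
    for soname in ["libgtk-3.so.0"]:
        print(soname, "ok" if can_load(soname) else "MISSING")
```

If the library is reported as missing, installing the packages from the linked Dockerfile lines is the fix described in this thread.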
I knew about this installation challenge and I noted it here: https://github.com/mozilla/OpenWPM/#troubleshooting. But my description is very vague, and I don't think you would reasonably have found it.
My prior belief was that server deployments would be done using Docker, so this would be handled by the Dockerfile. This is not the case, so what can we do?
One of my big goals with the conda rework was to not have bash scripts that affect the user's operating system, especially no sudo scripts. We do have the `./scripts/install-miniconda.sh` script, which does make changes at the OS level. It was deliberately not included in the `./install.sh` script.
Another goal was code de-duplication, so that we only have one place to maintain things.
I think my proposal is:
Move the `apt-get install` step in the Dockerfile into a script, `./scripts/install-ubuntu-server-deps.sh`.
... `Dockerfile` ... in an extra step
cc @englehardt @vringar @agogodavid for thoughts.
@birdsarah is it possible to add these dependencies to conda?
@birdsarah is it possible to add these dependencies to conda?
Depends. Let's review individually and be clear about the use case we're solving for:
Can we conda install the package:
Can we be really clear about the use case we're trying to support that we're missing: what users, in what situations, with what skills, trying to accomplish what?
Given that we're on a server, I'm imagining a user who can figure out how to `apt-get install` wget, git, and xvfb, and will probably do that manually anyway, so what level of extra support do they need?
If you're wondering why this issue is only arising now, it's because we previously `apt-get install`ed firefox before then downloading the unbranded build, so firefox's gtk dependencies were resolved that way.
Did a quick test of conda installing `gtk-3` and it doesn't seem to link correctly with the downloaded firefox. There might be some kind of a solution, but I don't know it.
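To see which shared libraries a binary fails to resolve (for example after trying a conda-installed GTK), one option is to read the output of `ldd`. This is a rough sketch, not part of OpenWPM: `missing_libraries` and `ldd` are hypothetical helper names, the sample text is illustrative rather than real output from this machine, and it assumes a Linux system where `ldd` marks unresolved entries with `=> not found`.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "<tab>libname.so.N => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def ldd(path: str) -> str:
    """Run ldd on `path` and return its standard output (Linux only)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    # Illustrative sample only; real output would come from ldd(...).
    sample = (
        "\tlibgtk-3.so.0 => not found\n"
        "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000000000)\n"
    )
    print(missing_libraries(sample))
```

On a real system, something like `missing_libraries(ldd("firefox-bin/libmozgtk.so"))` would target the file named in the XPCOMGlueLoad error earlier in the thread.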
Speaking as a noob, I think the use case I was after was to get OpenWpm working and be able to work with it from within a jupyter notebook or similar. That has not been easy to set up, so I'm currently running it as a batch process: load the URLs into a Python file, modify the parameters, then copy it to Docker and run. I could imagine some benefits of greater interaction with OpenWpm in the form of a notebook, but this suffices for my use case. Hope this helps the decision making process. Great work!
Speaking as a noob,
These are the best because they bring us fresh perspective which we've long lost.
I think the use case I was after was to get OpenWpm working and be able to work with it from within a jupyter notebook or similar.
I love jupyter notebooks and use them as a core part of my work. With my work on Bokeh and Dask, I also have some perspective on the challenges of trying to use OpenWPM via jupyter, and I would say it's hard and would be very difficult to provide a good/general experience.
Hope this helps the decision making process.
Really appreciate your perspective. Can you share a little more about getting your server set up and what would have made it easier?
Sorry for taking a moment - life in the time of covid....
My big challenge with the setup that worked eventually (docker) was my prior lack of knowledge about docker and how to get a file into the docker container to run the crawls - that was why I went with editing demo.py and then rerunning the docker install script. Anything that makes that part of the process a little clearer would have made my workflow more straightforward.
Another thing is the data dictionaries. I would actually like to contribute some of that based on my understanding of the dataset (planning a blog post or something). In my current use case I ended up needing only the source URL and target URL (although my initial plan was to match the newly crawled dataset to an original dataset taken from Mozilla's Lightbeam circa 2016).
Hey @agogodavid, sorry for following up on this after such a long time. Did you end up writing that blogpost and if so would you care to share it? My takeaway from this issue is that our setup scripts are incomplete and we need to expand them and that our documentation needs to improve.
... Your ubuntu server installation is missing some dependencies. ...
@birdsarah I also got the same issue. I gave up and just switched to the desktop version. It works well now on Ubuntu desktop. It'd be cool to run OpenWPM on the server versions too.
You can run it on the server version once you've installed the required dependencies. Unfortunately, we can't just change Firefox to support running without libgtk.
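After installing the dependencies, it can help to launch the Firefox binary on its own and read what it prints, since the loader error is easy to miss underneath the Selenium tracebacks. The traceback earlier in this thread shows OpenWPM itself calling the binary with `--version`; the sketch below does the same but keeps stderr. `probe` is a hypothetical helper name, and the demonstration runs the Python interpreter only so the example works anywhere.

```python
import subprocess
import sys
from typing import List, Tuple


def probe(command: List[str]) -> Tuple[int, str]:
    """Run `command` and return (exit status, captured stderr)."""
    result = subprocess.run(
        command, capture_output=True, text=True, check=False
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Demonstration with the Python interpreter, which exists everywhere.
    # For the case in this thread, the command would instead be the path to
    # firefox-bin/firefox-bin followed by "--version".
    status, stderr = probe([sys.executable, "--version"])
    print("exit status:", status)
    print("stderr:", stderr.strip() or "(empty)")
```

A non-zero status together with a message such as the `libgtk-3.so.0: cannot open shared object file` line quoted above would suggest that system libraries are still missing.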
@vringar Hello, I installed OpenWPM on Ubuntu 18.04 Server (on Google Cloud), following the steps at https://github.com/openwpm/OpenWPM. Previously I built it on Ubuntu 18.04 in VMware Workstation and everything worked properly. I think the problem is that the VMware setup uses display_mode=native while Google Cloud needs display_mode=headless. In openwpm/config.py, line 94, I changed display_mode: Literal["native", "headless", "xvfb"] = "native" to display_mode: Literal["native", "headless", "xvfb"] = "headless". When I run demo.py, I get this error:
(openwpm) g21520667@ubuntu-desktop:~/OpenWPM$ python3 demo.py
browser_manager - INFO - BROWSER 1105956086: Launching browser...
browser_manager - ERROR - BROWSER 1105956086: Crash in driver, restarting browser manager
Traceback (most recent call last):
File "/home/g21520667/OpenWPM/openwpm/browser_manager.py", line 716, in run
driver, browser_profile_path, display = deploy_firefox.deploy_firefox(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/OpenWPM/openwpm/deploy_browsers/deploy_firefox.py", line 147, in deploy_firefox
driver = webdriver.Firefox(
^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/selenium/webdriver/firefox/webdriver.py", line 67, in __init__
super().__init__(command_executor=executor, options=options)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 206, in __init__
self.start_session(capabilities)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 290, in start_session
response = self.execute(Command.NEW_SESSION, caps)["value"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 345, in execute
self.error_handler.check_response(response)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
browser_manager - ERROR - BROWSER 1105956086: Spawn unsuccessful | Profile Created: True | Profile Tar: True | Display: True | Launch Attempted: True | Browser Launched: False | Browser Ready: False
^CTraceback (most recent call last):
File "/home/g21520667/OpenWPM/openwpm/browser_manager.py", line 183, in launch_browser_manager
self.geckodriver_pid = check_queue(launch_status)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/OpenWPM/openwpm/browser_manager.py", line 150, in check_queue
raise BrowserCrashError("Browser spawn returned failure status")
openwpm.errors.BrowserCrashError: Browser spawn returned failure status
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/g21520667/OpenWPM/demo.py", line 62, in <module>
with TaskManager(
^^^^^^^^^^^^
File "/home/g21520667/OpenWPM/openwpm/task_manager.py", line 123, in __init__
Process StorageController:
self._launch_browsers()
Traceback (most recent call last):
File "/home/g21520667/OpenWPM/openwpm/task_manager.py", line 184, in _launch_browsers
success = browser.launch_browser_manager()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/OpenWPM/openwpm/browser_manager.py", line 214, in launch_browser_manager
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
self.close_browser_manager()
File "/home/g21520667/OpenWPM/openwpm/storage/storage_controller.py", line 346, in _run
await self.should_shutdown()
File "/home/g21520667/OpenWPM/openwpm/storage/storage_controller.py", line 273, in should_shutdown
await asyncio.sleep(STATUS_UPDATE_INTERVAL)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/asyncio/tasks.py", line 639, in sleep
return await future
^^^^^^^^^^^^
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/process.py", line 314, in _bootstrap
self.run()
File "/home/g21520667/OpenWPM/openwpm/browser_manager.py", line 310, in close_browser_manager
status = self.status_queue.get(True, 30)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/queues.py", line 116, in get
if not self._poll(timeout):
File "/home/g21520667/OpenWPM/openwpm/utilities/multiprocess_utils.py", line 44, in run
mp.Process.run(self)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/g21520667/OpenWPM/openwpm/storage/storage_controller.py", line 356, in run
asyncio.run(self._run(), debug=True)
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/asyncio/runners.py", line 123, in run
raise KeyboardInterrupt()
KeyboardInterrupt
^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/connection.py", line 259, in poll
return self._poll(timeout)
^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/connection.py", line 426, in _poll
r = wait([self], timeout)
^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/site-packages/multiprocess/connection.py", line 933, in wait
ready = selector.select(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/g21520667/mambaforge/envs/openwpm/lib/python3.11/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Can you help me?
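On the display_mode change described above: editing the default in openwpm/config.py alters it for every script that uses the package. A less invasive pattern is to leave the default alone and override the field on the parameter objects in the calling script. The sketch below uses a stand-in dataclass, `BrowserParamsSketch`, which is hypothetical and mirrors only the one field quoted from config.py; whether OpenWPM's own parameter class accepts the same assignment is an assumption to check against that file. Changing the display mode would also not, by itself, resolve a crash caused by missing system libraries like the one diagnosed earlier in this thread.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class BrowserParamsSketch:
    """Stand-in for illustration only; not OpenWPM's real class."""

    # Field copied from the line of openwpm/config.py quoted in this thread.
    display_mode: Literal["native", "headless", "xvfb"] = "native"


NUM_BROWSERS = 1  # assumed value for the example

# Build one parameter object per browser, then override per instance
# instead of editing the library default.
browser_params = [BrowserParamsSketch() for _ in range(NUM_BROWSERS)]
for params in browser_params:
    params.display_mode = "headless"

if __name__ == "__main__":
    print([p.display_mode for p in browser_params])
```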
After 643 was resolved, `demo.py` not running on Ubuntu 18.