openwpm / OpenWPM

A web privacy measurement framework
https://openwpm.readthedocs.io
Other
1.34k stars 315 forks source link

Assertion Error on tab restart #907

Closed MaazBinMusa closed 3 years ago

MaazBinMusa commented 3 years ago

Hi, I am simply using the latest openwpm to crawl list of top 100K sites. I install everything and it works.

Description: Ubuntu 18.04.5 LTS Release: 18.04 Codename: bionic

An hour into the experiment this happens (I reran the experiment and I saw that it crashed again and both times it happened on the 600th crawl):

Traceback (most recent call last):
  File "/home/data/maaz/openwpm/demo.py", line 85, in <module>
    manager.execute_command_sequence(command_sequence)
  File "/home/data/maaz/openwpm/openwpm/task_manager.py", line 616, in execute_command_sequence
    thread = self._start_thread(browser, command_sequence)
  File "/home/data/maaz/openwpm/openwpm/task_manager.py", line 372, in _start_thread
    self._check_failure_status()
  File "/home/data/maaz/openwpm/openwpm/task_manager.py", line 359, in _check_failure_status
    raise exc.with_traceback(tb)
  File "/home/data/maaz/openwpm/openwpm/browser_manager.py", line 505, in BrowserManager
    command.execute(
  File "/home/data/maaz/openwpm/openwpm/commands/browser_commands.py", line 493, in execute
    tab_restart_browser(webdriver)
  File "/home/data/maaz/openwpm/openwpm/commands/browser_commands.py", line 112, in tab_restart_browser
    assert len(webdriver.window_handles) == 1
AssertionError

Right before this happened, I got these warnings

storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Terminating handler, because the underlying socket closed
storage_controller   - INFO     - Received shutdown signal!
storage_controller   - INFO     - Awaiting all tasks for visit_id 2853014443642182
sql_provider         - WARNING  - Visit with visit_id 2853014443642182 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 3506648308045640
sql_provider         - WARNING  - Visit with visit_id 3506648308045640 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 8018336261893049
sql_provider         - WARNING  - Visit with visit_id 8018336261893049 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id -1
sql_provider         - WARNING  - Visit with visit_id -1 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 3111861480315613
sql_provider         - WARNING  - Visit with visit_id 3111861480315613 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 3885118769428323
sql_provider         - WARNING  - Visit with visit_id 3885118769428323 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 497764880975828
sql_provider         - WARNING  - Visit with visit_id 497764880975828 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 4270159864242582
sql_provider         - WARNING  - Visit with visit_id 4270159864242582 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 6760426766679700
sql_provider         - WARNING  - Visit with visit_id 6760426766679700 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 8961377448992377
sql_provider         - WARNING  - Visit with visit_id 8961377448992377 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 3965406499461189
sql_provider         - WARNING  - Visit with visit_id 3965406499461189 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 1831921873745174
sql_provider         - WARNING  - Visit with visit_id 1831921873745174 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 2674473531056103
sql_provider         - WARNING  - Visit with visit_id 2674473531056103 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 4207463580646241
sql_provider         - WARNING  - Visit with visit_id 4207463580646241 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 6873361876354698
sql_provider         - WARNING  - Visit with visit_id 6873361876354698 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 6943443783216465
sql_provider         - WARNING  - Visit with visit_id 6943443783216465 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 3697820437353834
sql_provider         - WARNING  - Visit with visit_id 3697820437353834 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 7675309726874817
sql_provider         - WARNING  - Visit with visit_id 7675309726874817 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 2858826860947996
sql_provider         - WARNING  - Visit with visit_id 2858826860947996 got interrupted
storage_controller   - INFO     - Awaiting all tasks for visit_id 5552794902783188
sql_provider         - WARNING  - Visit with visit_id 5552794902783188 got interrupted

Also can someone help me understand what this warning mean?

storage_controller   - INFO     - Terminating handler, because the underlying socket closed
vringar commented 3 years ago

Also can someone help me understand what this warning mean? storage_controller - INFO - Terminating handler, because the underlying socket closed

Sorry, that this seems like a warning. It only means that a browser shut down and closed the connection to the storage_controller.

An hour into the experiment this happens (I reran the experiment and I saw that it crashed again and both times it happened on the 600th crawl):

Could you share the website that was being visited at that time? Because this means that this website is breaking OpenWPM.

vringar commented 3 years ago

Right before this happened, I got these warnings

From my understanding this only happens because the TaskManager is a context manager and calls it's close method when the exception bubbles out of the with statement. The actual error should be happening further up in the logs.

MaazBinMusa commented 3 years ago

Also can someone help me understand what this warning mean? storage_controller - INFO - Terminating handler, because the underlying socket closed

Sorry, that this seems like a warning. It only means that a browser shut down and closed the connection to the storage_controller.

An hour into the experiment this happens (I reran the experiment and I saw that it crashed again and both times it happened on the 600th crawl):

Could you share the website that was being visited at that time? Because this means that this website is breaking OpenWPM.

Hi, thanks for the consistent and quick replies

This is the site that crashes the latest openwpm build. This throws a critical bug

http://www.yts.mx

vringar commented 3 years ago

This has been partially fixed by #945. The crawl should now continue running.