openwpm / OpenWPM

A web privacy measurement framework
https://openwpm.readthedocs.io

Handling RuntimeError from Storage Controller in OpenWPM #1079

Open MohammadMahdiJavid opened 9 months ago

MohammadMahdiJavid commented 9 months ago

Hi,

I am encountering a specific RuntimeError when using OpenWPM and I am seeking advice on the correct approach to handle it. The error occurs as follows:

  File "demo.py", line 470, in <module>
    manager.execute_command_sequence(command_sequence)

  File "openwpm/task_manager.py", line 435, in execute_command_sequence
    agg_queue_size = self.storage_controller_handle.get_most_recent_status()

  File "storage/storage_controller.py", line 625, in get_most_recent_status
    raise RuntimeError(

RuntimeError: No status update from the storage controller process for 4859 seconds.

I attempted to handle this exception with a try-except block, but it doesn't seem to be effective:

for command_sequence in command_sequences:
    try:
        manager.execute_command_sequence(command_sequence)
    except RuntimeError:
        # Skip this command sequence and try the next one
        continue

I would appreciate any guidance on the correct way to handle this exception. Is there a specific approach or pattern recommended for handling such timeouts or lack of status updates in OpenWPM? Any insights or suggestions would be greatly appreciated.

Thank you.

https://github.com/openwpm/OpenWPM/blob/25c537eb8eb8ac1e7b7f399ba5830bb37859332d/openwpm/task_manager.py#L410-L429

https://github.com/openwpm/OpenWPM/blob/25c537eb8eb8ac1e7b7f399ba5830bb37859332d/openwpm/storage/storage_controller.py#L536-L555

MohammadMahdiJavid commented 9 months ago

Because self.status_queue is still empty, the next call evaluates the same check again:

    if (time.time() - self._last_status_received) > STATUS_TIMEOUT:

It reuses the previous self._last_status_received, so the condition is still true and the exception is raised again.
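The behaviour described above can be reproduced with a small stand-alone model. This is a sketch, not OpenWPM's actual code: the `StatusHandle` class, the injectable `clock` argument, and the `STATUS_TIMEOUT` value of 120 seconds are assumptions made for illustration, and a plain `queue.Queue` stands in for whatever queue the real handle uses.

```python
import queue
import time

STATUS_TIMEOUT = 120  # seconds; assumed value, for illustration only


class StatusHandle:
    """Simplified stand-in for the storage controller handle (hypothetical)."""

    def __init__(self, clock=time.time):
        self.status_queue = queue.Queue()
        self._clock = clock
        self._last_status = 0
        self._last_status_received = clock()

    def get_most_recent_status(self):
        # Drain the queue; every update refreshes the timestamp.
        while not self.status_queue.empty():
            self._last_status = self.status_queue.get()
            self._last_status_received = self._clock()

        # With an empty queue the timestamp stays stale, so once the
        # timeout is exceeded this check fails on every later call.
        elapsed = self._clock() - self._last_status_received
        if elapsed > STATUS_TIMEOUT:
            raise RuntimeError(
                "No status update from the storage controller process "
                "for %d seconds." % elapsed
            )
        return self._last_status


# Fake clock so the example runs instantly.
now = [0.0]
handle = StatusHandle(clock=lambda: now[0])
now[0] = 4859.0  # no status update arrived in the meantime

errors = 0
for _ in range(3):
    try:
        handle.get_most_recent_status()
    except RuntimeError:
        errors += 1  # catching the error does not reset the timestamp
```

In this model every call in the loop raises, because nothing in the caller's `except` branch changes the stale timestamp.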

MohammadMahdiJavid commented 9 months ago

Proposed fix:

if self.status_queue.empty():
    self.status_queue.put(0)

Or maybe I'm not handling the exception correctly.

vringar commented 9 months ago

RuntimeError: No status update from the storage controller process for 4859 seconds.

This implies that the following code has not run for more than an hour, which is deeply troubling, as it should run every 5 seconds.

https://github.com/openwpm/OpenWPM/blob/25c537eb8eb8ac1e7b7f399ba5830bb37859332d/openwpm/storage/storage_controller.py#L237-L259

This suggests that you are performing long-running blocking operations in the storage controller, which breaks it. The storage process is a single-threaded asynchronous execution environment. Any task (including any storage provider) that doesn't yield control can lock up the whole process.

I would assume that whatever is blocking the process is also preventing any data from getting saved out.

This cannot be fixed anywhere outside of the storage (controller) process.
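To illustrate how a task that doesn't yield control starves everything else in a single-threaded event loop, here is a minimal asyncio sketch. It is not OpenWPM code: `heartbeat`, `slow_save`, and `run` are hypothetical names, the intervals are shortened so it finishes quickly, and handing the blocking call to an executor is only one possible way to keep the loop responsive.

```python
import asyncio
import time


async def heartbeat(ticks, interval):
    """Stand-in for a periodic status update."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


def slow_save(duration):
    """Stand-in for a blocking operation, e.g. a synchronous write."""
    time.sleep(duration)


async def run(blocking):
    """Return how many heartbeats fired while slow_save was in progress."""
    ticks = []
    task = asyncio.create_task(heartbeat(ticks, 0.01))
    await asyncio.sleep(0)  # let the heartbeat run its first iteration
    before = len(ticks)

    if blocking:
        # Called directly: the event loop cannot run anything else.
        slow_save(0.2)
    else:
        # Run in a worker thread: the event loop keeps scheduling tasks.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, slow_save, 0.2)

    count = len(ticks) - before
    task.cancel()
    return count


if __name__ == "__main__":
    print("heartbeats while blocking:", asyncio.run(run(blocking=True)))
    print("heartbeats while yielding:", asyncio.run(run(blocking=False)))
```

In the blocking case no heartbeat fires until `slow_save` returns, which matches the symptom of a status update that stops arriving.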