internetarchive / warcprox

WARC writing MITM HTTP/S proxy
378 stars 54 forks source link

Handle ValueError when trying to close WARC file #140

Closed vbanos closed 4 years ago

vbanos commented 4 years ago

We get a lot of the following error in production and warcprox becomes totally unresponsive when this happens.

CRITICAL:warcprox.writerthread.WarcWriterProcessor:WarcWriterProcessor(tid=16646) will try to continue after unexpected error
Traceback (most recent call last):
  File "/opt/spn2/lib/python3.5/site-packages/warcprox/__init__.py", line 140, in _run
    self._get_process_put()
  File "/opt/spn2/lib/python3.5/site-packages/warcprox/writerthread.py", line 60, in _get_process_put
    self.writer_pool.maybe_idle_rollover()
  File "/opt/spn2/lib/python3.5/site-packages/warcprox/writer.py", line 233, in maybe_idle_rollover
    w.maybe_idle_rollover()
  File "/opt/spn2/lib/python3.5/site-packages/warcprox/writer.py", line 188, in maybe_idle_rollover
    self.close()
  File "/opt/spn2/lib/python3.5/site-packages/warcprox/writer.py", line 169, in close
    fcntl.lockf(self.f, fcntl.LOCK_UN)
ValueError: I/O operation on closed file

Current code handles IOError. We also need to handle ValueError to address this.

vbanos commented 4 years ago

I tried to replicate this problem in my dev machine without success. I run warcprox, wrote a few records in a WARC and then deleted the WARC file manually. warcprox didn't have an issue. It created a new file and moved on.

dvanduzer commented 4 years ago

If I read the code correctly, it sounds like handling this exception would mean files that were already closed (forcibly by an external process?) will now also be renamed to remove the .open, instead of leaving the filename to signify that "something weird happened here."

Is that right? (It sounds like we have no easy way to confirm this in the test environment.)

nlevitt commented 4 years ago

I think this is a good change, but I'm wondering, why not just catch Exception, rather than try to enumerate specific exception types?

Separately, we should try to figure out why this happens.

vbanos commented 4 years ago

From my experience, the more specific we are with exception handling, the better we understand what is going on and we can fix it eventually. Specific exception types imply specific problem types.

nlevitt commented 4 years ago

The exception is logged either way, catching Exception won't hide useful information

vbanos commented 4 years ago

OK NP, changed it to catch Exception.

nlevitt commented 4 years ago

Thanks!