DominicBurkart / wikipedia-revisions

download every wikipedia edit
BSD 3-Clause "New" or "Revised" License

ignored exception bug #16

Open DominicBurkart opened 4 years ago

DominicBurkart commented 4 years ago

For some reason, an error from Response.iter_content that is usually associated with a generator being polled twice in different threads appears sporadically. I think the error may be coming from somewhere else, since each response is handled in its own worker thread and the error only appears sporadically. When the error case is triggered, all extractors begin extracting and then stall. When the program is killed and restarted, it functions normally.

It seems like the part of the program where the error is triggered is not directly related to the message associated with the error. I haven't yet localized exactly where the error is coming from, though.

Example error:

Exception ignored in: finalizer of <generator object Response.iter_content.<locals>.generate at 0x000000010a8a9aa0> ValueError: generator already executing
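
For reference, the `ValueError: generator already executing` message can be reproduced deterministically when one thread resumes a generator while another thread is still inside its body. The sketch below is not taken from this repository: `slow_chunks` and the two events are hypothetical stand-ins for a streaming generator like the one `Response.iter_content` returns. It only illustrates the usual cause of the message, which may not be what is happening here.

```python
import threading


def slow_chunks(started, release):
    """Hypothetical stand-in for a streaming generator."""
    started.set()    # signal that the generator body is now running
    release.wait()   # stay "executing" until the main thread lets go
    yield b"chunk"


started = threading.Event()
release = threading.Event()
gen = slow_chunks(started, release)

results = []
# The worker thread enters the generator and blocks inside its body.
worker = threading.Thread(target=lambda: results.append(next(gen)))
worker.start()
started.wait()

try:
    # A second thread resuming the same generator hits the re-entrancy guard.
    next(gen)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
finally:
    release.set()    # let the worker finish its own next() call
    worker.join()

print(error_message)
```

The second `next(gen)` fails only because the first call has not returned yet. If the generator in the report is really owned by a single worker thread, something other than that worker (for example a finalizer) would have to be touching it at the same time.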

os: High Sierra
interpreter: PyPy
related pr: only came up after https://github.com/DominicBurkart/wikipedia-revisions/commit/b4b70977ee900410e780461a5ec5d1f62c0c5d68 / https://github.com/DominicBurkart/wikipedia-revisions/commit/2cee01c9ecb9329f0c8100a4702014efc48a3e0d

DominicBurkart commented 4 years ago

Potentially related error:

Exception ignored in: finalizer of <generator object download_and_parse_files at 0x000000010f080060>
Traceback (most recent call last):
  File "/[...]/wikipedia-revisions-scraper/wikipedia_download.py", line 328, in download_and_parse_files
    yield lambda: parse_one_file(filename)
GeneratorExit

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/[...]/wikipedia_download.py", line 328, in download_and_parse_files
    yield lambda: parse_one_file(filename)
  File "/usr/local/Cellar/pypy3/7.3.1_1/libexec/lib-python/3/concurrent/futures/_base.py", line 611, in __exit__
    self.shutdown(wait=True)
  File "/usr/local/Cellar/pypy3/7.3.1_1/libexec/lib-python/3/concurrent/futures/thread.py", line 152, in shutdown
    t.join()
  File "/usr/local/Cellar/pypy3/7.3.1_1/libexec/lib-python/3/threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
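
One way a traceback of this shape can arise: a generator that is suspended inside a `with ThreadPoolExecutor(...)` block gets closed on one of that executor's own worker threads. Closing raises `GeneratorExit` at the `yield`, the `with` block calls `shutdown(wait=True)`, and the worker ends up trying to join itself. The sketch below is an assumption about the mechanism, not code from `wikipedia_download.py`; `produce_tasks` and `pool_out` are hypothetical names used only to force the situation.

```python
from concurrent.futures import ThreadPoolExecutor


def produce_tasks(pool_out):
    """Hypothetical generator that yields work from inside an executor block."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool_out.append(pool)      # expose the pool so the demo can submit to it
        yield lambda: "parsed"     # suspended here, still inside the with block


pool_out = []
gen = produce_tasks(pool_out)
task = next(gen)                   # advance to the yield; the executor is open

# Close the generator from a worker thread of the same executor. This mimics
# a finalizer that happens to run on that thread.
future = pool_out[0].submit(gen.close)
error = future.exception(timeout=5)

print(type(error).__name__, error)
```

In the report the close is triggered by the generator's finalizer rather than an explicit `gen.close()`, so which thread runs it depends on when the interpreter collects the generator. That would be consistent with the failure being sporadic.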

DominicBurkart commented 4 years ago

This bug reemerged today in a dockerized version of this app. Rebuilding and restarting the image was enough for it to work again as expected.