openeduhub / metalookup

Provide metadata about domains w.r.t. accessibility, licensing, ads, etc.
GNU General Public License v3.0

Cache warmup stops prematurely on timeouts #125

Closed MRuecklCC closed 2 years ago

MRuecklCC commented 2 years ago

Log example:

extractor_1   | Process SpawnProcess-1:
extractor_1   | Traceback (most recent call last):
extractor_1   |   File "/usr/local/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
extractor_1   |     self.run()
extractor_1   |   File "/usr/local/lib/python3.9/multiprocessing/process.py", line 108, in run
extractor_1   |     self._target(*self._args, **self._kwargs)
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/metalookup/caching/warmup.py", line 54, in warmup
extractor_1   |     asyncio.run(tasks(queue=deque(urls)))
extractor_1   |   File "/usr/local/lib/python3.9/asyncio/runners.py", line 44, in run
extractor_1   |     return loop.run_until_complete(main)
extractor_1   |   File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
extractor_1   |     return future.result()
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/metalookup/caching/warmup.py", line 50, in tasks
extractor_1   |     await asyncio.gather(*[task(id=id) for id in range(n_tasks)])
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/metalookup/caching/warmup.py", line 42, in task
extractor_1   |     await client.post(
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 559, in _request
extractor_1   |     await resp.start(conn)
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 913, in start
extractor_1   |     self._continue = None
extractor_1   |   File "/usr/local/lib/python3.9/site-packages/aiohttp/helpers.py", line 721, in __exit__
extractor_1   |     raise asyncio.TimeoutError from None
extractor_1   | asyncio.exceptions.TimeoutError

Timeout errors need to be caught and dealt with, probably with some exponential backoff / retry logic.
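
A rough sketch of what such retry logic could look like, not the actual fix that went into `warmup.py`. The helper name `post_with_retry`, its parameters, and the `send` callable are hypothetical; `send` stands in for whatever produces the `client.post(...)` awaitable in `task`. It catches `asyncio.TimeoutError` (the exception in the traceback above), waits with exponentially growing delays plus jitter, and returns `None` after the last attempt so that one slow URL does not take down the whole warmup process.

```python
import asyncio
import random


async def post_with_retry(send, max_attempts=4, base_delay=1.0, max_delay=30.0):
    """Await send() and retry on timeout with exponential backoff.

    `send` is a hypothetical zero-argument callable that returns a fresh
    awaitable for every attempt, e.g. `lambda: client.post(...)`.
    Returns the result of send(), or None if every attempt timed out.
    """
    for attempt in range(max_attempts):
        try:
            return await send()
        except asyncio.TimeoutError:
            if attempt == max_attempts - 1:
                # Give up on this request; the caller can move on to the next URL.
                return None
            # Delay doubles per attempt (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Scale by a random factor in [0.5, 1.0] so parallel tasks do not retry in lockstep.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
```

Inside `task`, the call would then presumably become something like `await post_with_retry(lambda: client.post(...))`, with the arguments left as they are today. Other exceptions are deliberately not caught here and would still propagate through `asyncio.gather`.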