python-arq / arq

Fast job queuing and RPC in python with asyncio and redis.
https://arq-docs.helpmanual.io/
MIT License
2.14k stars 173 forks source link

Issue using redis_pool and watchgod reloading #315

Open sondrfos opened 2 years ago

sondrfos commented 2 years ago

Hello,

we have recently started using Arq in our project. Working with Arq has been a pleasure, easy to understand, nice documentation and easy to read code. Thanks for a nice piece of work :)

We ran into an issue when we wanted to use watchgod reloading of our code. Mysteriously, whenever we changed the code it tried to connect to redis using the default-parameters, and not our custom host and port values.

Debugging resulted in us finding either a bug, or something we think should be specified in the documentation.

Our WorkerSettings-class uses the redis_pool-argument to provide an already created redis_pool. This is due to us, in some cases, already having a pool created which we want to use, especially during testing.

The following code represents the reloading-loop of Arq:

async for _ in awatch(path, stop_event=stop_event):
    print("\nfiles changed, reloading arq worker...")
    worker.handle_sig(Signals.SIGUSR1)
    await worker.close()
    loop.create_task(worker.async_run())

the problem lies in the woker.close() as it closes our predefined pool.

    async def close(self) -> None:
        if not self._handle_signals:
            self.handle_sig(signal.SIGUSR1)
        if not self._pool:
            return
        await asyncio.gather(*self.tasks.values())
        await self.pool.delete(self.health_check_key)
        if self.on_shutdown:
            await self.on_shutdown(self.ctx)
        await self.pool.close(close_connection_pool=True)
        self._pool = None

Is this intended? Do we have to provide redis_settings as well as redis_pool in the WorkerSettings-class? Or is this a bug? From our perspective it would be nice if it was possible to avoid closing the pool whenever a watchgod-reload happened.

NixBiks commented 1 year ago

This might be related to my issue when I run --watch on the arq cli command. I sometimes get this error

Task exception was never retrieved
future: <Task finished name='Task-10' coro=<Worker.async_run() done, defined at /Users/nix/Projects/plx/TaskRunners/.venv/lib/python3.11/site-packages/arq/worker.py:308> exception=AttributeError("'NoneType' object has no attribute 'zrangebyscore'")>
Traceback (most recent call last):
  File "/Users/nix/Projects/plx/TaskRunners/.venv/lib/python3.11/site-packages/arq/worker.py", line 313, in async_run
    await self.main_task
  File "/Users/nix/Projects/plx/TaskRunners/.venv/lib/python3.11/site-packages/arq/worker.py", line 354, in main
    await self._poll_iteration()
  File "/Users/nix/Projects/plx/TaskRunners/.venv/lib/python3.11/site-packages/arq/worker.py", line 379, in _poll_iteration
    job_ids = await self.pool.zrangebyscore(
                    ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'zrangebyscore'

Could be related since self.pool is None here. @sondrfos did you figure something out?

sondrfos commented 1 year ago

Hey @NixBiks,

We stopped using hot-reloading, so I don't know how you would fix it. Sorry