gevent / gevent

Coroutine-based concurrency library for Python
http://gevent.org

Resource pool leak due to interaction between gevent.timeout and queue.Queue #1875

Open RoganMurley opened 2 years ago

RoganMurley commented 2 years ago

Description:

It is a common pattern to use a queue as a resource pool. For example see the following:

from queue import Queue

class Resource:
    pass

class ResourcePool:
    def __init__(self, size):
        # Pre-fill the queue with `size` resources.
        self.pool = Queue()
        for _ in range(size):
            self.pool.put(Resource())

    def acquire(self):
        # Blocks until a resource is available.
        return self.pool.get()

    def release(self, resource):
        self.pool.put(resource)

    def use(self):
        resource = self.acquire()
        try:
            print(f'I am using the {resource}')
        finally:
            # Always hand the resource back, even if the body raises.
            self.release(resource)

Unfortunately, this pattern leaks resources when a standard library queue.Queue interacts with gevent.Timeout. The standard library queue can context switch even during a put that would not block, so a timeout exception can be raised while the resource is being released, and the resource never makes it back into the pool. This doesn't happen with gevent.queue.Queue in place of queue.Queue, because its non-blocking puts are atomic: a context switch won't occur until the resource has been fully released.
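
To make the failure mode concrete, here is a minimal stdlib-only sketch that simulates it without gevent. SimulatedTimeout and InterruptedQueue are hypothetical names invented for this illustration: they stand in for a timeout exception arriving inside put() before the item is stored, and are not how gevent or queue.Queue are actually implemented.

```python
from queue import Queue


class SimulatedTimeout(Exception):
    """Hypothetical stand-in for a timeout exception raised inside put()."""


class InterruptedQueue(Queue):
    """Hypothetical queue whose next put() is interrupted before the item is stored."""

    def __init__(self):
        super().__init__()
        self.interrupt_next_put = False

    def put(self, item, block=True, timeout=None):
        if self.interrupt_next_put:
            self.interrupt_next_put = False
            # The exception arrives before the item reaches the queue.
            raise SimulatedTimeout()
        super().put(item, block, timeout)


def use_resource(pool):
    resource = pool.get()
    try:
        pass  # work with the resource here
    finally:
        pool.put(resource)  # if this raises, the resource is lost


pool = InterruptedQueue()
for _ in range(3):
    pool.put(object())

# Simulate the timeout firing during the release.
pool.interrupt_next_put = True
try:
    use_resource(pool)
except SimulatedTimeout:
    pass

leaked = 3 - pool.qsize()
print(f"resources remaining: {pool.qsize()}, leaked: {leaked}")
```

The finally block does run, but because the put itself is interrupted, the pool ends up one resource short. After enough such timeouts the pool is empty and every acquire blocks.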

Here's an example where I fixed such an issue in a Thrift Connection Pool. I am now seeing more connection leak issues with other pools such as those used by Kombu and Redis.

Queues other than SimpleQueue are not patched by any of the gevent patch functions, so I imagine this resource leak is common. It was a surprise to me to find that patch_all and patch_queue only patch queue.SimpleQueue. The only reason I could find as to why is this comment mentioning that there are native thread use cases for the original objects.

This raises a bunch of questions:

embray commented 5 months ago

@RoganMurley Interesting analysis, thanks for this. I think I might be having the same problem but not sure yet.