aio-libs / janus

Thread-safe asyncio-aware queue for Python
Apache License 2.0

Queue closing does not affect sync .put() calls in waiting state #237

Open kc41 opened 4 years ago

kc41 commented 4 years ago

Hi! I found some potentially unexpected behaviour when closing a queue. If a producer thread is blocked in a sync put() and the queue is closed from another (control) thread, the producer thread waits forever. I would expect the sync put() method to raise RuntimeError when the queue is closed. What do you think?

Here is code to reproduce this situation:

import asyncio
import logging
from concurrent.futures.thread import ThreadPoolExecutor
from queue import Queue

import janus

logging.basicConfig(format='%(threadName)-12s: %(message)s', level=logging.DEBUG)

async def main(tpe):
    hybrid_q = janus.Queue(maxsize=1)

    def some_long_job(q: Queue):
        logging.info("Job is running")
        for i in range(int(1e6)):
            try:
                logging.info("Putting to q: %s", i)
                q.put(f"item_{i}")
                logging.info("Putting to q done: %s", i)
            except Exception as ex:
                logging.exception(ex)
                raise

    job = asyncio.ensure_future(asyncio.get_event_loop().run_in_executor(tpe, some_long_job, hybrid_q.sync_q))
    await asyncio.sleep(0.5)

    job.cancel()
    logging.info("Closing queue")
    hybrid_q.close()

    logging.info("Waiting q to be closed")
    await hybrid_q.wait_closed()
    logging.info("Queue was closed")

if __name__ == '__main__':
    tpe = ThreadPoolExecutor(thread_name_prefix="TPE_")
    asyncio.run(main(tpe))
    logging.info("Shutting down TPE")
    tpe.shutdown(wait=True)
    logging.info("TPE was shut down")

asvetlov commented 4 years ago

Thanks for the question.

I'm not sure raising an exception is a good fit for this case: it means that every q.put() would have to be wrapped in try/except, because literally every call could raise RuntimeError. That looks very annoying.

dplusic commented 4 years ago

What about notifying q._sync_not_full in q.close()? The blocked caller would then get the usual RuntimeError for a closing queue, provided q._check_closing() is called after _sync_not_full.wait() in sync_q.put().
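
For illustration, the idea above can be sketched on a toy queue. This is not janus code: ToyClosableQueue and demo_close_wakes_blocked_put are hypothetical names, and the only thing borrowed from the proposal is the pattern of notifying the "not full" condition in close() and re-checking the closing flag after wait() returns.

```python
import threading
import time
from collections import deque


class ToyClosableQueue:
    """Hypothetical bounded queue; NOT the janus implementation."""

    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._items = deque()
        self._closing = False
        self._not_full = threading.Condition()

    def _check_closing(self):
        if self._closing:
            raise RuntimeError("queue is closed")

    def put(self, item):
        with self._not_full:
            self._check_closing()
            while len(self._items) >= self._maxsize:
                self._not_full.wait()
                # Re-check after waking up: close() may be what woke us.
                self._check_closing()
            self._items.append(item)

    def close(self):
        with self._not_full:
            self._closing = True
            # Wake every producer blocked in put() so it can observe the flag.
            self._not_full.notify_all()


def demo_close_wakes_blocked_put():
    q = ToyClosableQueue(maxsize=1)
    q.put("first")  # the queue is now full
    errors = []

    def producer():
        try:
            q.put("second")  # blocks until close() is called
        except RuntimeError as exc:
            errors.append(exc)

    t = threading.Thread(target=producer)
    t.start()
    time.sleep(0.1)  # give the producer a moment to block
    q.close()
    t.join(timeout=5)
    return t.is_alive(), len(errors)


if __name__ == "__main__":
    print(demo_close_wakes_blocked_put())
```

With this pattern the blocked producer ends with RuntimeError instead of waiting forever, at the cost raised earlier in the thread: callers of put() have to be prepared for that exception.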

asvetlov commented 4 years ago

Fixed by #267

dplusic commented 4 years ago

@asvetlov How could #267 fix this?

asvetlov commented 4 years ago

Oops. Sorry, you are right. Hard day for me.

asvetlov commented 4 years ago

Please feel free to propose a pull request.

x42005e1f commented 3 weeks ago

I also found a related problem.

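import asyncio
import os
import time
from threading import Thread

import janus


def threaded(sync_q):
    print("before")
    sync_q.put(1)
    sync_q.put(2)  # blocks until the consumer frees a slot
    print("after")


async def main():
    loop = asyncio.get_running_loop()
    queue = janus.Queue(1)

    Thread(target=threaded, args=[queue.sync_q]).start()

    # Occupy every worker of the default executor.
    for _ in range(min(32, (os.cpu_count() or 1) + 4)):
        loop.run_in_executor(None, time.sleep, 1)

    await queue.async_q.get()
    queue.close()


asyncio.run(main())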

In this example, we fill the default executor with the maximum number of callbacks, so get() cannot notify the thread immediately, which is quite a common state for a highly loaded application. The close() call cancels the scheduled notification, so the thread's second put() never completes and "after" is never printed. If we remove the close() call, everything works fine.
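
The underlying mechanism can be shown without janus at all. The sketch below is only an analogy built on the standard library: a notification submitted to a busy executor stays pending, and cancelling it while it is pending means it never runs. The function name demo_cancelled_notification is made up for this example, and the single-worker pool stands in for a saturated default executor.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def demo_cancelled_notification():
    notified = threading.Event()  # stands in for "waiter was notified"
    started = threading.Event()
    release = threading.Event()

    def busy():
        started.set()
        release.wait()  # keep the only worker occupied

    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(busy)
        started.wait()  # the single worker is now busy

        # The notification is queued behind the busy task...
        pending = pool.submit(notified.set)
        # ...and cancelling it while still pending drops it for good.
        cancelled = pending.cancel()

        release.set()  # let the worker finish and drain the queue

    return cancelled, notified.is_set()


if __name__ == "__main__":
    print(demo_cancelled_notification())
```

Here cancel() succeeds because the task has not started, and the event is never set, which mirrors how a waiter can be left blocked once its scheduled wake-up is cancelled.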

This behavior is caused by the _notify_sync_not_full() change in b77ca59. Meanwhile, _notify_sync_not_empty() has no such semantics (why?). Either both methods should add futures to _pending, or neither should; otherwise the distinction doesn't make sense.