Closed by geraldog 3 months ago
Draft fix is at #474
Fixed in 0.19.0
Hi @sonic182
I'm test-crawling the top 1 million Cloudflare Radar domains.
In the end we alleviated the problem a lot, but after a million domains I still end up with around 20 RuntimeErrors. That is not much, roughly one every 50,000 domains, but still worth fixing in #483.
Describe the bug My crawler intermittently but persistently gets:
RuntimeError: readuntil() called while another coroutine is already waiting for incoming data
The stack trace does not help locate the bug. The error is raised at https://github.com/python/cpython/blob/d9efa45d7457b0dfea467bb1c2d22c69056ffc73/Lib/asyncio/streams.py#L525 but that by itself explains little.
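For readers unfamiliar with this error, here is a minimal sketch of how asyncio produces it. It uses a bare `asyncio.StreamReader` with no socket and no aiosonic code, so it only illustrates the mechanism inside streams.py. It is not the crawler's actual code path, and the function name `reproduce_double_read` is made up for this example.

```python
import asyncio


async def reproduce_double_read() -> str:
    """Return the error message raised by a second concurrent readuntil()."""
    reader = asyncio.StreamReader()

    # First reader: finds no data in the buffer and parks on the internal waiter.
    first = asyncio.create_task(reader.readuntil(b"\n"))
    await asyncio.sleep(0)  # yield once so the first task starts waiting

    try:
        # Second reader on the same StreamReader while the first is still waiting.
        await reader.readuntil(b"\n")
    except RuntimeError as exc:
        error_message = str(exc)
    else:
        error_message = ""
    finally:
        # Unblock the first reader so the task finishes cleanly.
        reader.feed_data(b"done\n")
        await first
    return error_message


double_read_message = asyncio.run(reproduce_double_read())
print(double_read_message)
```

The point of the sketch is that the StreamReader allows only one coroutine to wait for incoming data at a time, so any two code paths that read from the same connection concurrently will trip this check.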
After days of coding and tracing with print(), I found that even cancelling the waiter so that streams.py does not raise the RuntimeError is pointless. The real cause of the bug is that connection.close() is being called twice from different code paths, a concurrency mess.
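As an illustration of the double-close problem, here is a small sketch of one common way to make a close() method safe to call from two code paths. `GuardedConnection` and `teardown_count` are hypothetical names invented for this example. This is not aiosonic's Connection class and not necessarily what the draft fix does.

```python
import asyncio


class GuardedConnection:
    """Hypothetical stand-in for a connection object (not aiosonic's class)."""

    def __init__(self) -> None:
        self._closed = False
        self.teardown_count = 0  # counts how often the real teardown ran

    def close(self) -> None:
        # Idempotent guard: only the first caller performs the teardown.
        if self._closed:
            return
        self._closed = True
        # Real code would close the writer / release the connection here.
        self.teardown_count += 1


async def close_from_two_paths() -> int:
    conn = GuardedConnection()

    async def timeout_path() -> None:
        conn.close()

    async def cleanup_path() -> None:
        await asyncio.sleep(0)  # runs after the other path has already closed
        conn.close()

    await asyncio.gather(timeout_path(), cleanup_path())
    return conn.teardown_count


teardown_runs = asyncio.run(close_from_two_paths())
print(teardown_runs)
```

Because close() here is synchronous and has no await between the check and the flag update, a plain boolean is enough under asyncio's single-threaded scheduling. An async close() with awaits inside would need a lock or a similar guard instead.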
To Reproduce Steps to reproduce the behavior:
Expected behavior No RuntimeError caused by readuntil(), read(), or any other stream-reading awaitable that consumes from the StreamReader's byte buffer being called twice on top of each other.
Screenshots None
Desktop (please complete the following information): Not applicable
Smartphone (please complete the following information): Not applicable
Additional context Hi @sonic182, and sorry for the delay in filing the issue. I wanted to have a fix before discussing any of this. I have a draft of a fix and will file the PR today. Thanks for everything!