Tinche / aiofiles

File support for asyncio
Apache License 2.0
2.67k stars 149 forks source link

Interface inconsistent with asyncio.StreamWriter/Reader #188

Open jgarvin opened 1 month ago

jgarvin commented 1 month ago

I'm not sure if it existed when aiofiles was originally written, but nowadays asyncio has StreamWriter and StreamReader base classes, and it would be more consistent with the rest of the async APIs if open() returned subclasses of these. In particular, the write method is not async, and instead drain is provided which is sync. I believe the assumption is that the StreamWriter writes to the buffer of some Transport which is the thing that is actually connected to the event loop, detects device readiness and then drains the buffer. There is an async drain method that users can manually await, but IIUC it's expected that usually the Transport is doing that for you.

mjpieters commented 1 day ago

Stream APIs have been part of asyncio from the start. StreamWriter and StreamReader are specific interfaces required by asyncio networking, and should not be compared with how file objects work.

E.g. stream buffers are automatically emptied as network sockets are ready to send more. The .drain() method is merely a method to wait for the drain to complete and to provide 'back pressure', letting your application know when network data is backing up so you can pause other parts of your app.

You said:

There is an async drain method that users can manually await, but IIUC it's expected that usually the Transport is doing that for you.

No, you should await on drain() because if you don't you can end up overflowing network buffers, especially if you are pushing data to a stream writer that is coming from a faster source stream that in turn could just wait for capacity to free up. This is a fundamental principal of good networking code.

As the asyncio streams documentation states at the top:

Streams are high-level async/await-ready primitives to work with network connections.

bold emphasis mine.

As such, their interface is not really designed for use with filesystem file objects. The aiofiles project rightly doesn't try to fit Python's io / os / tempfile APIs into streams.

jgarvin commented 1 day ago

Hmm, I know aiofiles and stdlib are different projects but I find the situation confusing because streams are typically an abstraction over all of sockets+files+pipes. The point of it being a stream is usually so you don't have to know the underlying IO device/type, it's just something you take bytes out of or throw them into.