Tinche / aiofiles

File support for asyncio
Apache License 2.0

Performance: coarse or fine grain? #98

Closed. Korijn closed this issue 3 years ago

Korijn commented 3 years ago

Hi, thanks for this library. I have a question regarding performance.

Let's say you have a job that must iterate over 10 files and read them all into memory.

You could use this library, which would take about 21 await calls: 1x listdir, 10x open and 10x read. Each call would be queued separately on the thread pool executor.
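For illustration, here is a minimal sketch of this fine-grained approach. To stay stdlib-only it does not import aiofiles. Each `loop.run_in_executor` call stands in for one awaited aiofiles call, on the assumption stated above that every call is delegated to the thread pool. The function name `read_all_fine_grained` is made up for this example, and the directory is assumed to contain only regular files. Closing each file adds one more executor hop per file beyond the 21 counted above.

```python
import asyncio
import os


async def read_all_fine_grained(directory):
    """Read every file in `directory`, one executor hop per I/O call."""
    loop = asyncio.get_running_loop()
    # 1 hop: list the directory in a worker thread.
    names = await loop.run_in_executor(None, os.listdir, directory)
    contents = {}
    for name in sorted(names):
        path = os.path.join(directory, name)
        # 1 hop per file: open.
        f = await loop.run_in_executor(None, open, path, "rb")
        try:
            # 1 hop per file: read the whole file into memory.
            contents[name] = await loop.run_in_executor(None, f.read)
        finally:
            # 1 more hop per file: close (not counted in the estimate above).
            await loop.run_in_executor(None, f.close)
    return contents
```

It could be called as `asyncio.run(read_all_fine_grained("some_dir"))`, where `some_dir` is a placeholder path.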

Alternatively, you could write a function that does the work and schedule it as a single unit for the thread pool executor.
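A minimal sketch of this coarse-grained alternative, using only the standard library. The names `read_all_blocking` and `read_all_coarse_grained` are hypothetical, and the directory is assumed to contain only regular files. All the blocking work runs in one worker thread, so the event loop pays for a single executor hop regardless of the number of files.

```python
import asyncio
import os


def read_all_blocking(directory):
    """Plain blocking code: list the directory and read each file."""
    contents = {}
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), "rb") as f:
            contents[name] = f.read()
    return contents


async def read_all_coarse_grained(directory):
    """Schedule the whole job on the default executor as a single unit."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, read_all_blocking, directory)
```

The trade-off is that nothing from the job is available to the event loop until the whole function returns, whereas per-call scheduling yields control between individual operations.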

What are the performance implications of these two approaches? Does it matter?

I imagine approach 1 could become slower for large numbers of I/O operations due to the overhead of queue-based communication with the thread pool executor, but I am unsure.

What do you think?

Tinche commented 3 years ago

Using the executor definitely has overhead, so if you were looking to eke out the maximum level of performance, using it once instead of many times would be more efficient. But if you're worried about the performance implications of this, you'd be better off moving file reading somewhere else entirely (like a CDN or a dedicated nginx instance). For relatively low traffic services, aiofiles should be fine.

Korijn commented 3 years ago

Ok, thanks for the insights, that confirms my thoughts!