MKuranowski / aiocsv

Python: Asynchronous CSV reading/writing
https://pypi.org/project/aiocsv/
MIT License

Compatibility with asyncio TaskGroups #16

Closed crachefeu closed 12 months ago

crachefeu commented 12 months ago

This might be linked to issue #7, but I am unsure.

In my use case I use aiocsv to write results to a single file from multiple result streams (from multiple databases). I noticed that my result file contains many more lines than expected when I schedule the writes with create_task inside an asyncio.TaskGroup (see the Python docs on TaskGroup; example below). If I simply await each write instead, I don't have the issue.

System specifications

I am running this on Python 3.11.0 on macOS 12.3.1 (M1 chip), with aiocsv 1.2.4.

Example

Here is a simple snippet to illustrate the issue:

import asyncio
from aiocsv import AsyncWriter
import aiofiles

async def main(iterations):
    async with aiofiles.open('test.csv', 'w', encoding="utf-8") as f:
        writer = AsyncWriter(f)
        async with asyncio.TaskGroup() as tg:
            for i in range(iterations):
                tg.create_task(writer.writerow([i, "ben","and jerry's"]))

if __name__ == "__main__":
    asyncio.run(main(10))

I would expect the snippet above to write 10 lines to the CSV file; instead I get 55.

Am I missing something obvious here? Thanks in advance for your help.

MKuranowski commented 12 months ago

You're just creating one big race by allowing 10 coroutines to use the same AsyncWriter concurrently. Synchronize access to the writer with an asyncio.Lock or something similar.