MKuranowski / aiocsv

Python: Asynchronous CSV reading/writing
https://pypi.org/project/aiocsv/
MIT License

Reader and DictReader do not read entire csv file #3

Closed VStoilovskyi closed 3 years ago

VStoilovskyi commented 3 years ago
async def main():
    async with aiofiles.open("local_storage/csvs/example.csv", mode="r", newline="") as afp:
        headers = ['_id', 'balanceId', 'group', 'from', 'till', 'epicSpins', 'giftForEpic', 'userParams', 'enabled']
        async for row in AsyncDictReader(afp, dialect=Dialect.WRITE_DIALECT, fieldnames=headers):
            print(row)  # row is a dict

The code above prints only the CSV header row when the AsyncReader class is used, or a dict mapping each header to itself ({header: header}) when AsyncDictReader is used.

Script result:

DEBUG:asyncio:Using selector: KqueueSelector
{'_id': '_id', 'balanceId': 'balanceId', 'group': 'group', 'from': 'from', 'till': 'till', 'epicSpins': 'epicSpins', 'giftForEpic': 'giftForEpic', 'userParams': 'userParams', 'enabled': 'enabled'}
DEBUG:asyncio:Close <_UnixSelectorEventLoop running=False closed=False debug=True>

macOS Big Sur, Python 3.9.1

MKuranowski commented 3 years ago

Could you please attach a minimal CSV file that's causing the issue?

VStoilovskyi commented 3 years ago

Here's my .csv example

_id;balanceId;group;from;till;epicSpins;giftForEpic;userParams;enabled
5fd1d57f7aa577392017cc31;5fd1d57f7aa577392017cc31;1;2020-12-27 05:00:00;2020-12-28 04:59:59;3;1;"{""day"":{""lte"":4}}";1

In the module with the async function, I register the dialect:

csv.register_dialect('myDialect', delimiter=';', quoting=csv.QUOTE_ALL, escapechar='', quotechar='"')

Script entrypoint:

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    asyncio.run(main(), debug=True)

MKuranowski commented 3 years ago

Can't reproduce.

test.py:

from aiocsv import AsyncReader
from pprint import pprint
import aiofiles
import asyncio
import csv

async def main():
    csv.register_dialect("semicolon", delimiter=";", quoting=csv.QUOTE_ALL, escapechar="", quotechar='"')
    async with aiofiles.open("data.csv", mode="r", newline="") as afp:
        pprint([r async for r in AsyncReader(afp, dialect="semicolon")])

if __name__ == "__main__":
    asyncio.run(main())

$ cat data.csv
_id;balanceId;group;from;till;epicSpins;giftForEpic;userParams;enabled
5fd1d57f7aa577392017cc31;5fd1d57f7aa577392017cc31;1;2020-12-27 05:00:00;2020-12-28 04:59:59;3;1;"{""day"":{""lte"":4}}";1

$ file data.csv
data.csv: ASCII text, with CRLF line terminators

$ python test.py
[['_id',
  'balanceId',
  'group',
  'from',
  'till',
  'epicSpins',
  'giftForEpic',
  'userParams',
  'enabled'],
 ['5fd1d57f7aa577392017cc31',
  '5fd1d57f7aa577392017cc31',
  '1',
  '2020-12-27 05:00:00',
  '2020-12-28 04:59:59',
  '3',
  '1',
  '{"day":{"lte":4}}',
  '1']]

Maybe the line endings in your CSV are not what the dialect expects? The default row separator is CRLF, and aiocsv splits rows on the dialect's line terminator. (This isn't mentioned in the README, and it differs from how csv works; I should fix that.)
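
For comparison, here is a minimal sketch of the standard-library behaviour referred to above. The stdlib csv.reader recognises \r or \n as end-of-line and ignores the dialect's lineterminator, so LF and CRLF input parse identically. The dialect name "semicolon_demo" and the sample data are made up for illustration; this only exercises the stdlib csv module, not aiocsv.

```python
import csv
import io

# Hypothetical dialect for this demo; lineterminator is set to CRLF on purpose.
csv.register_dialect("semicolon_demo", delimiter=";", lineterminator="\r\n")

lf_data = "a;b\n1;2\n"          # LF line endings (typical on macOS/Linux)
crlf_data = "a;b\r\n1;2\r\n"    # CRLF line endings

# newline="" keeps the line endings untranslated, as the csv docs recommend.
lf_rows = list(csv.reader(io.StringIO(lf_data, newline=""), dialect="semicolon_demo"))
crlf_rows = list(csv.reader(io.StringIO(crlf_data, newline=""), dialect="semicolon_demo"))

# The stdlib reader parses both inputs into the same rows,
# even though the dialect says lineterminator="\r\n".
print(lf_rows)
print(crlf_rows)
```

A reader that instead splits rows on the dialect's lineterminator would treat the LF-only input as one long row, which matches the symptom reported in this issue.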

VStoilovskyi commented 3 years ago

You were right: assigning the line terminator explicitly fixed my problem. The CSV files on my Mac had \n as the line terminator, so I just changed the CSV dialect to:

csv.register_dialect('myDialect', delimiter=';', quoting=csv.QUOTE_ALL, escapechar='', quotechar='"',
                     lineterminator='\n')

But yeah, it would be great if you mentioned this case in the README.
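
If the line endings of an input file are not known in advance, one possible workaround is to inspect a sample of the file before registering the dialect. The helper below, detect_line_terminator, is hypothetical (it is not part of aiocsv or the csv module) and is only a rough sketch: it returns the first line break it sees, so a line break inside a quoted field would mislead it.

```python
def detect_line_terminator(sample: str) -> str:
    """Return the first line terminator found in sample (LF if none is found).

    Hypothetical helper for illustration; it does not understand CSV quoting.
    """
    for i, ch in enumerate(sample):
        if ch == "\r":
            # CR followed by LF is a CRLF pair; a lone CR is old Mac style.
            return "\r\n" if sample[i + 1 : i + 2] == "\n" else "\r"
        if ch == "\n":
            return "\n"
    return "\n"


# Sample strings standing in for the first chunk of a file
# opened with newline="" (so line endings are not translated).
print(repr(detect_line_terminator("a;b\r\n1;2\r\n")))
print(repr(detect_line_terminator("a;b\n1;2\n")))
```

The result could then be passed as lineterminator= to csv.register_dialect, in the same way as the explicit '\n' above.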

        # Guess the line terminator
        self._line_sep = self._csv_reader.dialect.lineterminator or "\n"

The main problem in the code, I guess, is that _csv_reader.dialect.lineterminator is set to \r\n by default, so the code above always uses the default line terminator (in my case).
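
The observation about the default can be checked with the standard library alone. This is a small sketch, not aiocsv's actual code path: it builds a plain csv.reader, reads the lineterminator from its dialect, and applies the same `or "\n"` fallback as the quoted line.

```python
import csv
import io

# A reader created without an explicit lineterminator.
default_reader = csv.reader(io.StringIO(""))
default_terminator = default_reader.dialect.lineterminator

# Same fallback expression as in the quoted snippet: the right-hand side
# is only used when the dialect's lineterminator is empty.
line_sep = default_terminator or "\n"

# A reader whose lineterminator is overridden explicitly.
lf_reader = csv.reader(io.StringIO(""), lineterminator="\n")
lf_line_sep = lf_reader.dialect.lineterminator or "\n"

print(repr(line_sep))
print(repr(lf_line_sep))
```

Because the default is a non-empty string, the `or "\n"` branch is not reached unless the terminator is explicitly set to an empty value, which is why LF-only files need lineterminator='\n' spelled out in the dialect.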

Thanks for the help; you may close the issue.