Askill closed this 3 months ago
Hi @Askill, thank you. Changing from '.txt' to '.csv' can be a breaking change; however, I do appreciate the suggestion to set utf-8
encoding when reading/writing the files. Would you like to resubmit your PR, covering not only createMeasurements.py but also all the other scripts?
@ifnesi ah, yes, you're right. I reverted the name change and added the encoding to all open() statements.
Generating the data failed on Win11 with Python 3.12; with these changes it works.
Previous Error:
Creating measurement file 'measurements.txt' with 1000000000 measurements...
  0%| | 0/100 [00:00<?, ?it/s]C:\projects\one-billion-row-challenge\1brc\createMeasurements.py:463: UserWarning: Polars found a filename. Ensure you pass a path to the file instead of a python file object when possible for best performance.
  data.write_csv(f, separator=sep, float_precision=1, include_header=False, )
  0%| | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\projects\one-billion-row-challenge\1brc\createMeasurements.py", line 507, in <module>
    measurement.generate_measurement_file(
  File "C:\projects\one-billion-row-challenge\1brc\createMeasurements.py", line 463, in generate_measurement_file
    data.write_csv(f, separator=sep, float_precision=1, include_header=False, )
  File "C:\projects\one-billion-row-challenge\venv\Lib\site-packages\polars\dataframe\frame.py", line 2696, in write_csv
    self._df.write_csv(
polars.exceptions.InvalidOperationError: file encoding is not UTF-8
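For anyone reading this later, here is a rough sketch of the kind of change discussed above. It is not the actual diff from the PR. When open() is called without an encoding argument, Python falls back to the platform's locale encoding, which on Windows is usually a legacy code page rather than UTF-8, and that would be consistent with the error in the traceback. The helper names write_measurements and read_measurements, the sample rows, and the file name are hypothetical and only serve the illustration; the sketch uses the standard library only and does not call Polars.

```python
import os
import tempfile


def write_measurements(path, rows, sep=";"):
    """Write (station, temperature) pairs as 'station;temp' lines in UTF-8.

    Hypothetical helper for illustration; not taken from createMeasurements.py.
    """
    # An explicit encoding avoids falling back to the platform default
    # (often a legacy code page on Windows). newline="\n" keeps the line
    # endings identical on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for station, temp in rows:
            f.write(f"{station}{sep}{temp:.1f}\n")


def read_measurements(path, sep=";"):
    """Read the file back using the same explicit encoding."""
    result = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            station, temp = line.rstrip("\n").split(sep)
            result.append((station, float(temp)))
    return result


if __name__ == "__main__":
    # Sample data with non-ASCII station names (assumed values).
    sample = [("Zürich", 9.3), ("São Paulo", 19.6)]
    target = os.path.join(tempfile.mkdtemp(), "measurements_sample.txt")
    write_measurements(target, sample)
    print(read_measurements(target))
```

Passing encoding="utf-8" on both the write and the read side means the result no longer depends on the machine's locale settings, which is presumably why the same script behaved differently on Win11 than elsewhere.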