Open sndpgm opened 1 year ago
A similar result occurs with AWS S3.
>>> import polars as pl
>>> df_ng = pl.read_csv("s3://my_bucket/path/to/test_a1.csv.gz")
>>> df_ng
shape: (0, 1)
┌─────────────────────────────┐
│ <�d�test_a1.csv J�I�I�J�1ԩ… │
│ --- │
│ str │
╞═════════════════════════════╡
└─────────────────────────────┘
First decompress the file.
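A minimal sketch of that workaround, assuming the object can be fetched as raw bytes with fsspec (with s3fs or gcsfs installed for the relevant bucket). The helper names `gunzip_bytes` and `read_csv_gz_from_cloud` are hypothetical and not part of Polars; only `gzip.decompress`, `fsspec.open`, and `pl.read_csv` are existing calls.

```python
import gzip
import io


def gunzip_bytes(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` is gzip data, else return it unchanged."""
    # Gzip streams start with the magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        return gzip.decompress(raw)
    return raw


def read_csv_gz_from_cloud(url: str):
    """Hypothetical helper: fetch a (possibly gzipped) CSV and parse it with Polars."""
    # Assumption: fsspec and the matching backend (s3fs / gcsfs) are installed.
    import fsspec
    import polars as pl

    # Read the whole object into memory as raw, still-compressed bytes.
    with fsspec.open(url, "rb") as f:
        raw = f.read()

    # Decompress locally, then hand Polars an in-memory buffer.
    return pl.read_csv(io.BytesIO(gunzip_bytes(raw)))
```

Calling `read_csv_gz_from_cloud("s3://my_bucket/path/to/test_a1.csv.gz")` should then parse the decompressed CSV instead of the raw gzip bytes. This loads the entire file into memory, so it is only a stopgap for files that fit in RAM.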
pl.read_csv does not support compressed files?
Looks like pl.read_csv doesn't support gz files from cloud storage. Probably some issue with the metadata.
I'm happy to look into it and work on this item.
I'm actually reading csv.gz from local and AWS. Is this issue solved?
Polars version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of Polars.
Issue description
A csv.gz file in a Google Storage (GS) bucket cannot be read properly using pl.read_csv. The results appear to be garbled:
Reproducible example
Expected behavior
The expected result is the one obtained by reading the same file on a local PC (the file is linked below):
test_a1.csv.gz
Installed versions