Closed Nicolas-SB closed 2 months ago
@MarcoGorelli this one seems like it's right up your alley. @Nicolas-SB just a small nit: pandas doesn't have a native Parquet reader, it just uses pyarrow, so your pandas tests are just repeating the pyarrow ones.
+1
Hey everyone, may I try to fix this? :)
sure @jstet go ahead
Checks
Reproducible example
Log output
Issue description
We have a Parquet file, written by pyarrow, with a column containing date, time, and time zone. On the pyarrow side, this column has the type pa.timestamp("us", "utc"). pyarrow and pandas can read this file without problems, but polars throws the "unable to parse time zone" exception.
Our assumption is that, in "_try_from_arrow_unchecked", "validate_time_zone" does not recognize "utc" (lowercase) and expects "UTC" (uppercase).
Expected behavior
Since pyarrow accepts lowercase "utc", we expect polars to be able to read it as well.
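To illustrate the kind of case-insensitive handling we have in mind, here is a minimal sketch. It is not polars' actual code: the helper name normalize_time_zone is hypothetical, and it uses only Python's standard zoneinfo module. It special-cases "utc" so the result does not depend on which time zone database is installed.

```python
from zoneinfo import available_timezones


def normalize_time_zone(tz: str) -> str:
    """Hypothetical helper: return the canonical spelling of a time zone name.

    Names are matched case-insensitively. Unknown names are returned
    unchanged so that a later validation step can still reject them.
    """
    lowered = tz.lower()
    # Handle "utc" explicitly so this works even without a tz database.
    if lowered == "utc":
        return "UTC"
    # Map lowercase spellings to the canonical IANA names known to zoneinfo.
    by_lower = {name.lower(): name for name in available_timezones()}
    return by_lower.get(lowered, tz)


print(normalize_time_zone("utc"))  # → UTC
```

With something like this applied before validation, a column typed pa.timestamp("us", "utc") would be treated the same as one typed pa.timestamp("us", "UTC").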
Installed versions