I observed that Parquet timestamps with microsecond or millisecond precision are read with only second precision in Postgres.
Steps to reproduce, in Python:
import datetime
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

n = 100
# Timestamps 101 ms apart, so every row after the first has a sub-second component
df = pd.DataFrame({
    "time": [pd.Timestamp(datetime.datetime(2000, 1, 1) + datetime.timedelta(seconds=0.101 * i)) for i in range(n)],
    "value_float": [0 + i / 100 for i in range(n)],
})
table = pa.Table.from_pandas(df, preserve_index=False)
# Store the timestamps with millisecond precision
pq.write_table(table, '/path/to/test.parquet', coerce_timestamps='ms')
Then, in postgres:
After analysis by @ahoy-jon, the problem comes from https://github.com/adjust/parquet_fdw/blob/master/src/common.hpp#L27.
This solves the problem for milliseconds:
If you are interested, I can provide a patch for the other time units as well.
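For illustration only, here is what a unit-aware conversion has to do, sketched in Python. This is not the patch mentioned above and not parquet_fdw's code. The function and constant names are hypothetical. It assumes the source value is an integer count of the given unit since the Unix epoch, and that the target is microseconds since 2000-01-01, the representation Postgres uses for timestamps.

```python
import datetime

# Seconds between the Unix epoch (1970-01-01) and the Postgres epoch (2000-01-01).
POSTGRES_EPOCH_OFFSET_S = 946_684_800

# Ticks per second for each time unit (hypothetical lookup table).
TICKS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def to_postgres_micros(value: int, unit: str) -> int:
    """Convert an integer timestamp in `unit` since the Unix epoch to
    microseconds since 2000-01-01, keeping sub-second precision."""
    ticks = TICKS_PER_SECOND[unit]
    if ticks <= 1_000_000:
        # Scale up to microseconds; nothing is lost.
        micros = value * (1_000_000 // ticks)
    else:
        # Nanoseconds are finer than microseconds; floor to microseconds.
        micros = value // (ticks // 1_000_000)
    return micros - POSTGRES_EPOCH_OFFSET_S * 1_000_000


# 2000-01-01 00:00:00.101 expressed in milliseconds since the Unix epoch.
ms_value = 946_684_800_101
pg_micros = to_postgres_micros(ms_value, "ms")
as_datetime = datetime.datetime(2000, 1, 1) + datetime.timedelta(microseconds=pg_micros)
print(pg_micros, as_datetime)
```

The point of the sketch is that the value is scaled to microseconds before the epoch shift, rather than being reduced to whole seconds first, which would discard the fractional part.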
Regards,