Closed tking320 closed 11 months ago
This is not much of a reproducible example, I'm afraid. Can you at least show the schema of the file? How was it made? Which column causes the issue?
This is the file that caused the error, you can try it.
This is the first time we have seen DELTA encoding (a parquet V2 feature) in a v1 data page. This should be fixable relatively easily, please stay tuned.
It would be useful if you could let us know the expected values of the "ts" column, as a test.
You can use pandas to parse the file, and the result will be correct:
from pandas import read_parquet

# f is the path to the attached parquet file
df = read_parquet(f)
df['ts'] = df['ts'].astype(str)
df['upload_ts'] = df['upload_ts'].astype(str)
print(df)
That makes sense :). I was wrong in my initial assertion: we do handle DELTA already, but only for 32-bit types (normally int), so we need to extend it to 64-bit.
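The core of the decode can be sketched in a few lines: DELTA_BINARY_PACKED stores a first value plus bit-packed deltas, and the original values come back from a running sum. This is only an illustrative sketch, not fastparquet's internals — the helper name `reconstruct` and the already-unpacked delta list are assumptions. The point is that the accumulator must be a 64-bit type for columns like "ts", since microsecond timestamps are far outside the int32 range:

```python
def reconstruct(first_value: int, deltas: list[int]) -> list[int]:
    """Recover values from a first value and a list of deltas
    (the prefix-sum step of DELTA_BINARY_PACKED decoding)."""
    out = [first_value]
    for d in deltas:
        # In a C/numpy implementation this accumulator must be int64,
        # otherwise 64-bit timestamps overflow a 32-bit accumulator.
        out.append(out[-1] + d)
    return out

# e.g. a microsecond timestamp column: values exceed the int32 range
ts = reconstruct(1_600_000_000_000_000, [1000, 250, 4])
```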
Describe the issue:
When I use fastparquet to convert a dataframe, an error occurs: "When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array"
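That message matches numpy's error from `ndarray.view` when a buffer of a smaller dtype cannot be reinterpreted as a larger one: the byte length of the last axis must divide evenly by the new itemsize. A minimal stdlib analogue of the same rule, using `memoryview.cast` (this reproduces the divisibility constraint, not fastparquet's actual code path):

```python
import struct

# Five int32 values occupy 20 bytes; 20 is not a multiple of 8, so the
# buffer cannot be reinterpreted as int64 -- the same divisibility rule
# behind the "must be a divisor of the total size" error.
buf = struct.pack("<5i", 1, 2, 3, 4, 5)
try:
    memoryview(buf).cast("q")  # 'q' = signed 64-bit integer
except TypeError as exc:
    print(exc)
```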
Minimal Complete Verifiable Example:
Anything else we need to know?:
Environment: linux