Closed by ramses-lee 7 months ago
Can you try reading it as a file and stream? Maybe try pyarrow directly.
Not exactly sure what you meant, but I tested both the parquet read_table() function and the pyarrow memory_map() function, and both gave me an error.
Ahh, I fixed it. The file wasn't closed properly.
This works now.
import pyarrow as pa

with open('data/flights-200k.arrow', 'rb') as f:
    buf = f.read()

with pa.ipc.open_file(buf) as reader:
    df = reader.read_pandas()

print(df)
A general demonstration is outlined in this Google Colab notebook: https://colab.research.google.com/drive/1oKhivD5T9Yi1gMl0_7dUwqVFqiNfD43k?usp=sharing
The 'flights-200k.arrow' file produces an error every time I try to read it using the pandas package.