uwdata / arquero

Query processing and transformation of array-backed data tables.
https://idl.uw.edu/arquero
BSD 3-Clause "New" or "Revised" License
1.22k stars, 64 forks

Feather format written by PyArrow 7.0.0 cannot be opened in Arquero #279

Open josesho opened 2 years ago

josesho commented 2 years ago

Hi,

With pyarrow v7.0.0, I write the iris dataset to a Feather file:

import seaborn as sns
import pyarrow.feather as pf  # write_feather lives in pyarrow.feather

iris = sns.load_dataset("iris")  # returns a pandas DataFrame
pf.write_feather(iris, 'iris.feather')

The resulting file is then loaded into Arquero in an Observable notebook. The table has the correct column structure, but the data inside the columns shows up in a different format than expected.

[Screenshot 2022-05-20 at 10 16 57]

I think this is related to #270, and hopefully resolved by #277? I hope that fix can be merged soon!

PS: I use pyarrow to (lightly) munge and convert a 4GB dataset into a smaller 36MB Feather file, and I'd like to start building a dashboard with it, using Arquero as the analytics engine.

Thanks for all the hard work!

josesho commented 2 years ago

I see that with arquero>=5.0, this issue is resolved, with the caveat that only uncompressed Feather files can be read in for the time being!