alvations closed this issue 1 year ago
You are using the wrong method.
Basically, you need to use one of these:
df.export_parquet("file.parquet")
# or
df.export("file.parquet")  # picks the export method automatically from the file extension
By contrast, df.export_arrow("file.arrow") exports to a different (Arrow-native) file format.
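To make the extension-based dispatch concrete, here is a minimal sketch of how a file extension could be mapped to an export method name. This is not vaex's actual implementation: `EXPORT_METHOD_BY_SUFFIX` and `pick_export_method` are hypothetical names, and the table only covers the two methods mentioned in this thread.

```python
from pathlib import Path

# Hypothetical lookup table (an assumption for illustration, not vaex internals).
# It only lists the two export methods named in this thread.
EXPORT_METHOD_BY_SUFFIX = {
    ".parquet": "export_parquet",
    ".arrow": "export_arrow",
}


def pick_export_method(path):
    """Return the name of the export method implied by the file extension."""
    suffix = Path(path).suffix.lower()
    try:
        return EXPORT_METHOD_BY_SUFFIX[suffix]
    except KeyError:
        # Unknown extensions are rejected rather than guessed.
        raise ValueError(f"no export method known for extension {suffix!r}")


print(pick_export_method("file.parquet"))
print(pick_export_method("file.arrow"))
```

The point is that the extension, the writer, and the reader all have to agree: a file written with the Arrow method is not a Parquet file, whatever reader is pointed at it.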
Thanks for the quick reply! I get the right write/read functions and extensions now.
With these vaex and pyarrow versions:
When reading a TSV file and exporting it to Arrow, the resulting Arrow table couldn't be loaded properly by pyarrow.read_table(). For example, given a file such as s2t.tsv:
And exporting the TSV to Arrow as such, then reading it back:
It throws the following error:
Are there additional args/kwargs that should be added when exporting or reading the parquet files?
Or is exporting to Arrow bugged/broken somehow?
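One way to narrow this down is to check what format the exported file really is, independent of its extension. The sketch below is only a rough diagnostic that uses the standard library. It relies on the documented magic bytes: Parquet files begin with `PAR1`, Arrow IPC files begin with `ARROW1`, and Arrow IPC streams normally begin with the continuation marker `0xFFFFFFFF`. The names `classify_header` and `sniff_format` are hypothetical helpers, not part of vaex or pyarrow.

```python
def classify_header(head):
    """Guess a file format from its first bytes (heuristic only)."""
    if head[:4] == b"PAR1":
        return "parquet"
    if head[:6] == b"ARROW1":
        return "arrow-file"
    # Arrow IPC streams have no magic string; current writers start each
    # message with a 0xFFFFFFFF continuation marker, so this is a best guess.
    if head[:4] == b"\xff\xff\xff\xff":
        return "arrow-stream"
    return "unknown"


def sniff_format(path):
    """Read the first six bytes of a file and classify them."""
    with open(path, "rb") as f:
        return classify_header(f.read(6))


print(classify_header(b"PAR1\x15\x04"))
print(classify_header(b"ARROW1"))
```

If the file turns out to be an Arrow stream or an Arrow file, a Parquet reader such as `pyarrow.parquet.read_table` is the wrong tool. `pyarrow.ipc.open_stream` and `pyarrow.ipc.open_file` are the readers for those two Arrow layouts.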