JuliaData / Feather.jl

Read and write feather files in pure Julia
https://juliadata.github.io/Feather.jl/stable

`InexactError` when Int64 offset value tries to convert to Int32 #64

Open ExpandingMan opened 6 years ago

ExpandingMan commented 6 years ago

The error occurs here on dataframes with sufficiently large columns. The most obvious fix would be to change all of the offsets to Int64, but does the feather format even support that? Is this a fundamental limitation? If so, that seems really bad, because the dataset was only about 20 GB.
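For readers unfamiliar with the failure mode, here is a minimal sketch of how an `InexactError` arises when an Int64 value is narrowed to Int32. The `offset` value and the `fits_in_int32` helper are hypothetical and chosen only for illustration; they are not taken from Feather.jl's source.

```julia
# Hypothetical offset, one past the largest value an Int32 can hold.
offset = Int64(typemax(Int32)) + 1

# Illustrative helper (not part of Feather.jl): does the value fit in an Int32?
fits_in_int32(x::Integer) = typemin(Int32) <= x <= typemax(Int32)

# Int32(x) throws InexactError when x cannot be represented exactly.
result = try
    Int32(offset)
catch err
    err
end

println(fits_in_int32(offset))      # false
println(result isa InexactError)    # true
```

A check like `fits_in_int32` could be used to fail early with a clearer message, but it does not remove the underlying 32-bit limit.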

nalimilan commented 6 years ago

AFAIK Feather/Arrow intentionally uses Int32 to force people to use multiple blocks to store large arrays. I don't really understand that reasoning, but it's stated here: https://github.com/apache/arrow/blob/master/format/Layout.md#array-lengths
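The "multiple blocks" idea can be sketched as splitting a long column into index ranges whose lengths each fit in an Int32. This is only an illustration under that assumption: `chunk_ranges` is a hypothetical helper, and nothing in this thread establishes that Feather.jl or the Feather format can read or write columns chunked this way.

```julia
# Hypothetical helper: split `n` elements into consecutive index ranges,
# each holding at most `maxlen` elements (default: the Int32 maximum).
function chunk_ranges(n::Integer, maxlen::Integer = typemax(Int32))
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    maxlen > 0 || throw(ArgumentError("maxlen must be positive"))
    n64, m64 = Int64(n), Int64(maxlen)
    # Start a new range every `m64` elements; the last range may be shorter.
    return [i:min(i + m64 - 1, n64) for i in Int64(1):m64:n64]
end

small = chunk_ranges(10, 4)           # ranges 1:4, 5:8, 9:10
large = chunk_ranges(5_000_000_000)   # every range has length <= typemax(Int32)
println(length(large))
```

Each range could then be used to slice the column into a separate array, so that no single array length exceeds the 32-bit limit.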

ExpandingMan commented 6 years ago

Supposedly this has been resolved; see here and here. I'm not sure yet what is required on our end to resolve it.