This is something I'm still worried about; recall this issue on Feather.jl. No, I haven't had any communication with the Arrow community yet.
Confusingly, Feather seems to violate the Arrow format in several places. From what I could gather, Feather files really do seem to contain arrays with length greater than `typemax(Int32)`, and they somehow get around that by relying on the C++ arrow package's ability to pull data in chunks.
Note that I have "silently" changed all of the array length values in Arrow.jl to be `Int` rather than `Int32`. This would make it very easy for us to cheat in some cases, but of course `offsets` are still `Int32`.
So I don't know. This is a major concern, but I haven't really had time to reach out to the Arrow community about it yet.
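To make the concern concrete, here is a minimal sketch of what the `Int32` limit means in practice and how a long array could, in principle, be split into chunks whose lengths each fit. This is not Arrow.jl or Feather code: `MAX_I32`, `fits_int32` and `chunk_lengths` are hypothetical names invented for illustration, and I am only assuming that chunking works roughly along these lines.

```julia
# Hypothetical helpers for illustration only; none of this is Arrow.jl API.

# Largest value representable in an Int32 length or offset field.
const MAX_I32 = Int(typemax(Int32))

# True if `n` can be stored in an Int32 length/offset field without overflow.
fits_int32(n::Integer) = 0 <= n <= MAX_I32

# Split a logical array length into chunk lengths that each fit in `maxlen`.
# Assumption: chunks are filled greedily, with the remainder in the last chunk.
function chunk_lengths(total::Integer, maxlen::Integer = MAX_I32)
    total >= 0 || throw(ArgumentError("total must be non-negative"))
    maxlen > 0 || throw(ArgumentError("maxlen must be positive"))
    nfull, rest = divrem(total, maxlen)
    lens = fill(Int(maxlen), nfull)
    rest > 0 && push!(lens, Int(rest))
    return lens
end

# Int32 arithmetic wraps silently, which is why offsets are the real hazard:
wrapped = typemax(Int32) + Int32(1)      # equals typemin(Int32)

# Converting an oversized length throws instead of wrapping:
# Int32(5_000_000_000)                   # would throw InexactError

lens = chunk_lengths(5_000_000_000)      # every entry fits in Int32
```

Keeping lengths as `Int` lets a single logical array exceed the limit, but any `Int32` offset buffer still has to be checked with something like `fits_int32` before it is written.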
See related issues here and here.