JuliaData / Feather.jl

Read and write feather files in pure Julia
https://juliadata.github.io/Feather.jl/stable

non `Int32` categorical references #80

Open ExpandingMan opened 6 years ago

ExpandingMan commented 6 years ago

This issue is to discuss how we deal with categorical references that are not `Int32`. Such references violate the Arrow standard (and this is not the only suspicious thing that the Feather format does). In Julia it is very easy to support references of any type, but my main concern is consistency with other Arrow implementations. Initially I thought that only Feather.jl produced non-`Int32` references, but Python feather seems to deal with them just fine, so I'm not sure what the real source of these files is.

It does seem that some people have data sitting around that actually uses non-`Int32` references.

My suggested solution for the time being is that we support reading non-`Int32` references, but when writing we only ever use `Int32`. I'm going to make the appropriate changes to Arrow.jl so that #78 implements this.
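
To make the proposed policy concrete, here is a minimal sketch of "read any reference width, write only `Int32`". It is written in Python purely for illustration and is not the Arrow.jl or Feather.jl API: `REF_FORMATS`, `read_refs`, and `write_refs` are hypothetical names, and the sketch assumes references are stored as a flat little-endian buffer of signed integers.

```python
import struct

# Hypothetical mapping from a reference type name to its struct format code.
# With the "<" prefix these have standard sizes: 1, 2, 4 and 8 bytes.
REF_FORMATS = {"Int8": "b", "Int16": "h", "Int32": "i", "Int64": "q"}


def read_refs(buf, ref_type):
    """Decode a buffer of categorical references of any supported width."""
    fmt = REF_FORMATS[ref_type]
    width = struct.calcsize("<" + fmt)
    count = len(buf) // width
    return list(struct.unpack("<%d%s" % (count, fmt), buf[: count * width]))


def write_refs(refs):
    """Encode references as Int32 only, regardless of how they were read.

    struct.pack raises struct.error if a value does not fit in 32 bits,
    so oversized references fail loudly instead of being truncated.
    """
    return struct.pack("<%di" % len(refs), *refs)


# Round trip: references stored as Int8 are read, then re-written as Int32.
legacy = struct.pack("<3b", 0, 2, 1)
refs = read_refs(legacy, "Int8")
rewritten = write_refs(refs)
```

In this sketch the reader is permissive and the writer is strict, so any file that passes through a read/write cycle comes out with `Int32` references.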

quinnj commented 6 years ago

I agree w/ that approach; support reading, but only produce valid Arrow.

sglyon commented 6 years ago

I also agree! Writing proper Arrow, but reading stuff in the wild, is a great balance.