man-group / sparrow

C++20 idiomatic APIs for the Apache Arrow Columnar Format
Apache License 2.0
32 bit support fails #175

Open serge-sans-paille opened 3 months ago

serge-sans-paille commented 3 months ago

Sparrow seems to hard-code the assumption that the underlying architecture is 64-bit; see build.log in https://koji.fedoraproject.org/koji/taskinfo?taskID=121660160

Klaim commented 3 months ago

It's more that the layout code follows the usual C++ container logic of assuming that containers/arrays can't be bigger than what the native address space supports, so 32-bit platforms have 32-bit pointers and 32-bit size value types. However, ArrowArray imposes 64-bit sizes whatever the platform, and the difference is caught at compile time when we attempt to move an iterator by an offset of a 64-bit type. Reproducible on Windows (and probably all platforms) by trying to build for 32-bit.

Klaim commented 3 months ago

Arrow's doc specifies that the values of size types can be limited to 32 bits even if the imposed representation type is a 64-bit int (and they recommend limiting to 2^31 - 1, the maximum signed 32-bit integer), so that would match what's supported by the classic C++ container interfaces, but we need some explicit conversions (and maybe a way to check values at runtime if requested). Working on it.
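
For illustration only, here is a minimal sketch of what such explicit conversions could look like. The names `fits_in`, `checked_size_cast` and `size_cast` are hypothetical and are not taken from sparrow; the only assumption about Arrow is that sizes such as `ArrowArray::length` and `ArrowArray::offset` are stored as `int64_t`. One variant always checks and throws, the other checks only when assertions are enabled, which is one possible reading of "check values at runtime if requested".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helpers for illustration; these names are not sparrow's API.

// Returns true when a 64-bit Arrow size (length, offset, ...) is non-negative
// and fits in the integer type To.
template <class To>
constexpr bool fits_in(std::int64_t value) noexcept
{
    return value >= 0
        && static_cast<std::uint64_t>(value)
               <= static_cast<std::uint64_t>(std::numeric_limits<To>::max());
}

// Always-on check: throws if the value cannot be represented in To.
template <class To>
constexpr To checked_size_cast(std::int64_t value)
{
    if (!fits_in<To>(value))
    {
        throw std::out_of_range("Arrow size does not fit in the native size type");
    }
    return static_cast<To>(value);
}

// Opt-in check: verified only when assertions are enabled (NDEBUG not defined).
template <class To>
constexpr To size_cast(std::int64_t value) noexcept
{
    assert(fits_in<To>(value));
    return static_cast<To>(value);
}
```

With something like this, an iterator could be moved by `checked_size_cast<std::ptrdiff_t>(offset)` instead of by the raw 64-bit value, so the narrowing is explicit on 32-bit targets and a no-op range check on 64-bit ones. In C++20, `std::in_range` from `<utility>` could replace the manual comparison if negative values also need to be accepted for signed targets.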