apache / arrow-nanoarrow

Helpers for Arrow C Data & Arrow C Stream interfaces
https://arrow.apache.org/nanoarrow
Apache License 2.0
149 stars 34 forks source link

refactor: Use inttypes.h macros instead of casts to print fixed-width integers #520

Closed WillAyd closed 3 weeks ago

WillAyd commented 3 weeks ago

Very pedantic change to the great work done in https://github.com/apache/arrow-nanoarrow/pull/507 just to prevent any issues on LLP64 systems

WillAyd commented 3 weeks ago

Sure happy to take a larger look. Do you know what the R issues were with ADBC? Would have to double check but I am fairly certain this is a C99 macro, so surprised that something wouldn't support it at this point

paleolimbot commented 3 weeks ago

Would have to double check but I am fairly certain this is a C99 macro, so surprised that something wouldn't support it at this point

I'll take a look tomorrow, but it was something about old versions of mingw having an inconsistency between the definition of PRId64 and the implementation of printf().

In any case, isolating this behaviour into utility functions is maybe still a good idea?

WillAyd commented 3 weeks ago

Are you thinking of just redefining PRId64 via macros or wrapping a new printf function? I'm a little wary of trying to make something not standards compliant become standards compliant. Doesn't feel like a great value add to nanoarrow, but of course open to whatever direction you want to take this in

paleolimbot commented 3 weeks ago

I think that the issue was that the mingw inttypes.h header ( https://github.com/msys2-contrib/mingw-w64/blob/master/mingw-w64-headers/crt/inttypes.h ) had some interactions with the format check attribute and __USE_MINGW_ANSI_STDIO:

https://github.com/apache/arrow-adbc/blob/acef3c826599c5d4c18849bfe2164d8d0ce53fbc/c/driver/common/utils.h#L34-L41

I don't think that the format check attribute is particularly important (it is opt-in via the NANOARROW_DEBUG define), so I think PRId64 is safe 🙂 . The old version of mingw that was causing problems was the basis for the R package build toolchain for R <= 3.6, but that version just went out of the tidyverse support matrix.

I was had been thinking something like #define NANOARROW_PR64(EXPR) ((int64_t)value) + ArrowErrorSet("Some big value is %" NANOARROW_PRID64, NANOARROW_PR64(some_value)), but since the issue was actually just the format check attribute and not the format itself, I don't think it's needed 🙂 .