JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

In Julia 1.11, jl_array_data() can return integers cast as pointers #56508

Open droodman opened 1 week ago

droodman commented 1 week ago

With the introduction of the GenericMemory type and the revamping of the Array implementation, the interface for embedding in C has changed in breaking ways. As someone who has built and distributed a free package that embeds Julia in another software environment, I was surprised to learn that, semantic versioning conventions notwithstanding, the move from 1.10 to 1.11 is breaking.

In particular, the old one-argument jl_array_data() defined in julia.h has been replaced with a two-argument jl_array_data() and a one-argument jl_array_data_(). But embedding applications compiled with pre-1.11 Julia libraries will still fail if they use the old jl_array_data() while invoking Julia 1.11.
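One way an embedding application might guard against this kind of mismatch is to compare the version it was built against with the version it finds at runtime, and refuse to continue if the minor series differs. The sketch below is self-contained and does not call into Julia; both helper names are hypothetical. In a real embedding, the build-time numbers would presumably come from the Julia headers and the runtime numbers from the loaded library, but that wiring is an assumption and is not shown here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true only when the runtime is in the same
   major.minor series the host application was compiled against. */
static bool embed_version_ok(int built_major, int built_minor,
                             int run_major, int run_minor)
{
    return built_major == run_major && built_minor == run_minor;
}

/* Hypothetical helper: report a mismatch up front instead of
   crashing later on a changed macro or struct layout.
   Returns 0 when the versions match, -1 otherwise. */
static int embed_check_or_report(int built_major, int built_minor,
                                 int run_major, int run_minor)
{
    if (embed_version_ok(built_major, built_minor, run_major, run_minor))
        return 0;
    fprintf(stderr,
            "built against Julia %d.%d but found %d.%d at runtime; "
            "the C embedding interface may differ\n",
            built_major, built_minor, run_major, run_minor);
    return -1;
}
```

A host built against 1.10 and started with 1.11 would get -1 from embed_check_or_report(1, 10, 1, 11) and could stop before touching any array data.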

In addition, while the new jl_array_data_() and jl_array_data() are written to return pointers, they can also return integers cast as pointers. Most often in this case they will return 0/NULL. Then, dereferencing the return value will cause a segfault. Is this set-up intentional and optimal?

The new macro definitions in julia.h are

#define jl_array_data(a,t) ((t*)((jl_array_t*)(a))->ref.ptr_or_offset)
#define jl_array_data_(a) ((void*)((jl_array_t*)(a))->ref.ptr_or_offset)

As its name suggests, ptr_or_offset can be "an actual pointer or an offset (if T is zero size or a Union)."
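To make the hazard concrete, here is a minimal mock in plain C. It does not use julia.h: every type and name prefixed with mock_ is a hypothetical, simplified stand-in, and the holds_offset flag and the byte-offset encoding are assumptions made only for illustration. It shows how a macro shaped like the ones quoted above hands back the raw field, and how a guarded accessor could always return an address inside the storage.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins. These are NOT the julia.h definitions. */
typedef struct {
    size_t length;  /* size of the backing storage in bytes */
    void *ptr;      /* start of the backing storage */
} mock_memory_t;

typedef struct {
    void *ptr_or_offset;  /* a real pointer, or an offset kept in a pointer-sized field */
    mock_memory_t *mem;
} mock_memoryref_t;

typedef struct {
    mock_memoryref_t ref;
    int holds_offset;     /* assumed flag: nonzero when ptr_or_offset is an offset */
} mock_array_t;

/* Same shape as the quoted macro: a blind cast of ptr_or_offset. */
#define mock_array_data(a, t) ((t*)((mock_array_t*)(a))->ref.ptr_or_offset)

/* Guarded accessor: resolves an offset against the backing storage
   (treated as a byte offset in this mock), otherwise returns the
   stored pointer unchanged. */
static void *mock_array_ptr(const mock_array_t *a)
{
    if (a->holds_offset)
        return (char *)a->ref.mem->ptr + (uintptr_t)a->ref.ptr_or_offset;
    return a->ref.ptr_or_offset;
}
```

With holds_offset set and ptr_or_offset equal to 0, mock_array_data() yields a null pointer, which would segfault on dereference, while mock_array_ptr() yields the start of the storage.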

The embedding documentation doesn't mention this complication. It just says "In order to access the data of x, we can use jl_array_data."

It would be nice for julia.h to supply macros/functions that behave as users would naively expect, always returning pointers to the start of the data. Otherwise, I think a clarification in the documentation would be very helpful.

Also, I think the embedding documentation should prominently state that "non-breaking" Julia updates (in the sense of semantic versioning) may be breaking for applications that embed Julia.

Or, a more ambitious take-away could be: maybe the embedding interface, like the module system, needs the spiritual equivalent of a public keyword, a set of interface functions and macros that are documented public API, and which won't break.

Thanks to the core developers for all they do.

Taaitaaiger commented 6 days ago

You can use jl_array_ptr instead of jl_array_data.

That said, I think the fact that the C interface currently has no stability guarantees should at least be acknowledged in the embedding documentation.