libzfp cannot be found on Apple Silicon macOS

jvo203 commented 1 year ago

Trying to use ZfpCompression on the Apple Silicon macOS (M1 Ultra Mac Studio). However, despite ZfpCompression.jl supposedly compiling & installing its own version of zfp, Julia 1.9.1 errors out when calling zfp_compress:

UndefVarError(:libzfp)

Manually compiling & installing zfp into /usr/local/{include/lib} does not solve the problem either.

jvo203 commented 1 year ago

Extra information: ZfpCompression.jl appears to work fine on Intel macOS as well as in Linux. Only the Apple Silicon macOS is affected by the problem.

milankl commented 1 year ago

@giordano do you know where this problem could arise from? Maybe related: zfp released 1.0 in the meantime, I don't know how many interface changes there are that require changes here in this wrapper too, but I'm happy to test this. How can we trigger binary builder to get in a new release? Then we should probably set compat zfp_jll = 0.5.5 here and experiment with zfp_jll = 1

giordano commented 1 year ago

Our build of zfp is old, it predates by three days the official announcement of the Apple Silicon platform, we couldn't possibly support it back then :slightly_smiling_face: Yes, a new build of a newer version should solve this issue (as long as compilation doesn't present other problems)

milankl commented 1 year ago

https://github.com/JuliaPackaging/Yggdrasil/blob/fcc005a12d007f7d5d1a7b825308ace17466b6f4/Z/zfp/build_tarballs.jl#L6-L11

Do we just create a pr to Yggdrasil updating these?

giordano commented 1 year ago

Yes, and add ; julia_compat="1.6" to the last line.

milankl commented 1 year ago

https://github.com/JuliaPackaging/Yggdrasil/pull/6926

jvo203 commented 1 year ago

Thank you, that was quick! The upgraded build zfp_jll 1.0.0 fixes the problem on Apple Silicon.

milankl commented 1 year ago

Could you run the tests and report here if any fail? I haven’t checked what changed from 0.5.5 to 1.0.0

jvo203 commented 1 year ago

Another of my projects (using C / FORTRAN) recently got upgraded from zfp 0.5.5 to 1.0.0 and there were no problems at all, no API breakage at all. So I wouldn't expect ZfpCompression.jl to have any issues either. But anyway, will report any potential problems with 1.0.0.

jvo203 commented 1 year ago

I've just remembered the one API break that wasn't really a "show-stopper". Below is a release note from ZFP:

This release is not ABI compatible with prior releases due to numerous changes
to function signatures and data structures like zfp_field. However, few of
the API changes, other than to the cfp C API for compressed arrays, should
impact existing code. Note that numerous header files have been renamed or
moved relative to prior versions.

Specifically, all calls to zfp_field_{1,2,3,4}d should be updated. Prior to 1.0.0 the dimensions nx, ny, nz, nw etc. passed to zfp_field_* were of type uint. In ZFP 1.0.0 they are now size_t. However, this is not a "hard" breaking change as passing by value uint in place of size_t does not cause a problem in C.

[zfp_field](https://zfp.readthedocs.io/en/release1.0.0/high-level-api.html#c.zfp_field)* zfp_field_4d(void* pointer, [zfp_type](https://zfp.readthedocs.io/en/release1.0.0/high-level-api.html#c.zfp_type) type, size_t nx, size_t ny, size_t nz, size_t nw)

Still, ideally the following code and all the other 1D, 2D, 3D calls should probably be updated to use Csize_t instead of Cuint:

"""Pass a 4-D array into a zfp_field in C, in Julia only as Ptr{Cvoid}."""
function zfp_field(A::Array{T,4}) where T
    nx,ny,nz,nw = size(A)
    field = ccall((:zfp_field_4d,libzfp),Ptr{Cvoid},
        (Ptr{Cvoid},Cint,Cuint,Cuint,Cuint,Cuint),A,zfp_type(T),nx,ny,nz,nw)
    sx,sy,sz,sw = strides(A)
    ccall((:zfp_field_set_stride_4d, libzfp), Cvoid, (Ptr{Cvoid}, Cint, Cint, Cint, Cint),
        field, sx, sy, sz, sw)
    return field
end

jvo203 commented 1 year ago

Another related change: strides are now pointer differences ptrdiff_t (64-bit signed integer defined in <stddef.h>?) instead of int.

void zfp_field_set_stride_4d([zfp_field](https://zfp.readthedocs.io/en/release1.0.0/high-level-api.html#c.zfp_field)* field, ptrdiff_t sx, ptrdiff_t sy, ptrdiff_t sz, ptrdiff_t sw)
Specify strides for 4D array: sx = &a[0][0][0][1] - &a[0][0][0][0]; sy = &a[0][0][1][0] - &a[0][0][0][0]; sz = &a[0][1][0][0] - &a[0][0][0][0]; sw = &a[1][0][0][0] - &a[0][0][0][0].

jvo203 commented 1 year ago

from the ZFP release notes:

size_t and ptrdiff_t replace uint and int for array sizes and strides in the array classes and C/Fortran APIs.
zfp_bool replaces int as Boolean type in the C API.
bitstream_offset and bitstream_size replace size_t to ensure support for 64-bit offsets into and lengths of bit streams. Consequently, the bitstream API has changed accordingly.

Also, zfp_compress and zfp_decompress now return size_t

size_t zfp_compress([zfp_stream](https://zfp.readthedocs.io/en/release1.0.0/high-level-api.html#c.zfp_stream)* stream, const [zfp_field](https://zfp.readthedocs.io/en/release1.0.0/high-level-api.html#c.zfp_field)* field)

zfp_write_header and zfp_read_header also return size_t.

milankl commented 1 year ago

Could you do ] test ZfpCompression ? I do get an error, meaning we would need to release an update here to prevent this (as the upgrade to 1.0.0 therefore currently fails)

Testing Running tests...
Lossless 1-4D for all types: Error During Test at /Users/milan/.julia/packages/ZfpCompression/ark3w/test/runtests.jl:4
  Got exception outside of a @test
  MethodError: no method matching Array{Float32, 3}(::UndefInitializer, ::UInt32, ::UInt32)

Stacktrace:
   [1] zfp_decompress(src::Vector{UInt8})
     @ ZfpCompression ~/.julia/packages/ZfpCompression/ark3w/src/zfp.jl:374

milankl commented 1 year ago

The error originates here https://github.com/milankl/ZfpCompression.jl/blob/4bbefa2ef5e06c44a4397d3a8cd9da4d16c00412/src/zfp.jl#L367-L374

ndims is correctly obtained, but ZF.nx,ZF.ny,ZF.nz,ZF.nw seems to have at least 2 zeros, even for 3D and 4D arrays. Meaning that decompression is currently only possible with 1D and 2D, but not for 3/4D. Do you see immediately what's wrong here?

jvo203 commented 1 year ago

So far I've used it for the 1D case, the 2D case got compressed in Julia too but I haven't got round to decompressing it in JavaScript / WebAssembly C on the client side (got so far as the 1D case). The 2D decompression on the client side will come next week.

A quick thought: since the nx, ny, nz, nw are now size_t in the C API instead of uint, Julia's ZfpCompression.jl is still passing 32-bit Cint instead of 64-bit Csize_t. Perhaps those 2 zeros you are seeing are somewhat related to the two "missing" bytes (64 bits versus 32 bits).

jvo203 commented 1 year ago

Update: am too seeing test errors in "Lossless 1-4D" on Intel macOS. Will try the Apple Silicon macOS tomorrow. But definitely it looks like ZfpCompression.jl needs to be adjusted to match the changed C data types in the ZFP 1.0.0 C API. That's the only thing that's changed between 0.5.5 and 1.0.0.

jvo203 commented 1 year ago

After forking ZfpCompression.jl all the tests have passed with flying colors on the M1 Mac Studio:

chris@studio test % julia localtests.jl
Test Summary:               | Pass  Total  Time
Lossless 1-4D for all types |   14     14  0.7s
Test Summary:                             | Pass  Total  Time
Max abs error is bound in 1-4D for floats |   32     32  1.0s

The only change I made was to replace Cuint with Csize_t inside the ZfpField structure (fields nx, ny, nz, nw), file zfp.jl:

# READ IN ARRAYS
struct ZfpField
    type::ZfpType
    nx::Csize_t
    ny::Csize_t
    nz::Csize_t
    nw::Csize_t
    sx::Cint
    sy::Cint
    sz::Cint
    sw::Cint
    data::Ptr{Cvoid}
end

For completeness something should be done about the strides too, they are now 64-bit (I think) ptrdiff_t instead of 32-bit int.

EDIT1: it appears the strides sx, sy, sz, sw from the structure ZfpField are never actually accessed / used by zfp.jl. Strides seem to be passed directly to the ZFP C API functions without reading the field structure values. On the other hand, the dimensions nx, ny, nz, nw are being used by calling ZF.nx, ....

EDIT2: Since the ZfpField structure is being loaded directly from the C memory region ZfpField(field::Ptr) = unsafe_load(Ptr{ZfpField}(field)), even if the strides are not being accessed they should still be set to the correct byte width matching the C ptrdiff_t, otherwise strange behaviour might result...

EDIT3: Right now the C structure

typedef struct {
  zfp_type type;            // scalar type (e.g., int32, double)
  size_t nx, ny, nz, nw;    // sizes (zero for unused dimensions)
  ptrdiff_t sx, sy, sz, sw; // strides (zero for contiguous array a[nw][nz][ny][nx])
  void* data;               // pointer to array data
} zfp_field;

occupies more memory than the Julia ZfpField so there is no risk of out-of-bounds memory accesses. But still, incorrect values are being loaded into the strides, even if those incorrect strides from ZfpField are never actually being used. So the Julia ZFP bindings work fine.

jvo203 commented 1 year ago

For completeness / correctness I've just tested Cptrdiff_t and it works:

# READ IN ARRAYS
struct ZfpField
    type::ZfpType
    nx::Csize_t
    ny::Csize_t
    nz::Csize_t
    nw::Csize_t
    sx::Cptrdiff_t
    sy::Cptrdiff_t
    sz::Cptrdiff_t
    sw::Cptrdiff_t
    data::Ptr{Cvoid}
end

chris@studio test % julia localtests.jl
Test Summary:               | Pass  Total  Time
Lossless 1-4D for all types |   14     14  0.7s
Test Summary:                             | Pass  Total  Time
Max abs error is bound in 1-4D for floats |   32     32  1.0s

milankl commented 1 year ago

Awesome many thanks! You mind putting that into a pr? I guess best would be to release it as a patch so 0.2.2 or alternatively we constraint zfp_jll to 0.5.5 with a patch and then zfp 1.0.0 gets 0.3

giordano commented 1 year ago

I guess best would be to release it as a patch so 0.2.2 or alternatively we constraint zfp_jll to 0.5.5 with a patch and then zfp 1.0.0 gets 0.3?

Might be easier to add compat bounds with 0.5 in the registry, and then the PR here would require zfp_jll 1, no need to make it a breaking change if user-facing interface is unchanged

giordano commented 1 year ago

Might be easier to add compat bounds with 0.5 in the registry

https://github.com/JuliaRegistries/General/pull/86209

jvo203 commented 1 year ago

OK. I'll make a PR soon so that a new ZfpCompression.jl 0.3 can be released that takes advantage of zfp_jll 1.0.0.

ZFP 1.0.0 fixes a bug present in all prior versions of ZFP so it makes sense to move on from 0.5.5.

jvo203 commented 1 year ago

Hi, would you be making a new release anytime soon? Last week I submitted a pull request (see https://github.com/milankl/ZfpCompression.jl/pull/20).

As it stands ZfpCompression.jl does not work on Apple Silicon without making a new release.

milankl commented 1 year ago

Yes, absolutely. Sorry, I'm on travels. I'd make it a patch release though, otherwise everyone who installs 0.2 will face the same issue.

jvo203 commented 1 year ago

No problems, as long as at some point folks running Apple Silicon can use the new ZfpCompression.jl on their machines (patched or newly-released)!

milankl / ZfpCompression.jl

libzfp cannot be found on Apple Silicon macOS #19