Closed jamesrhester closed 2 years ago
Great. Could you add some tests? https://github.com/JuliaIO/HDF5.jl/blob/master/test/filter.jl
Will do.
I think you are missing an implementation of bshuf_h5_set_local
function.
https://github.com/kiyo-masui/bitshuffle/blob/master/src/bshuf_h5filter.c#L29
Yes, working on it now. Having to write tests meant I discovered that I'd ignored the writing part of the filter...
See https://github.com/JuliaIO/HDF5.jl/blob/ccdb3e90a5910bee954d066de7b3f775f9067bc3/filters/H5Zblosc/src/H5Zblosc.jl#L19 if you need an example of that.
Also, we could just compile the C code for the plugin, probably as a separate Yggrassil recipe. We could then just use the H5PL
interface to tell HDF5 where the plugin is.
It turns out that I put the lower compat limit of bitshuffle_jll
as 1.6, have to wait for that to be fixed.
OK, I'm informed (see https://github.com/JuliaPackaging/Yggdrasil/pull/5111) that the lower compatibility limit for bitshuffle_jll
should be 1.6. So I don't know if I should be trying to pass the Julia 1.3 checks?
These tests should only run with 1.6 then. Just use a @static if VERSION >= v"1.6"
to qualify the tests and the package import.
I have used @static
as suggested, not sure how to manage the import resolution fail for the 1.3 tests.
I'm happy to merge this if you are. Thanks again.
Yes, I'm happy to merge. Thanks for your review and help.
I'm pausing for a few days to give others the chance to review. I'm also going to do some further testing with plugins from other sources and via different APIs.
Can the license correctly be categorized as a MIT license?
Yes, it was meant to be a MIT license copied from the original bitshuffle distribution. Thanks for merging this.
@jamesrhester , I'm seeing a sporadic CI failure on 64-bit Windows: https://github.com/JuliaIO/HDF5.jl/runs/7292646756?check_suite_focus=true
We may need to review
Looks like nightly is OK, but Julia 1.7 has a problem. I'll see if I can spin up a Windows 64 VM here and run the test repeatedly.
I'm suspecting it could be because of a lack of @GC.preserve
at the end of the filter function with the unsafe_load
and unsafe_store!
calls. The corresponding H5Zblosc filter doesn't have these, however, so once I've got a test VM running I'll try and duplicate the error and then see if the @GC.preserve
fixes it.
The original code allocates 11 Cuints
for values, but you only allocate 8 (similar to tmp_values
) in bitshuffle_set_local
. You also don't seem to move the values to a higher position. I think you may be overwriting values stored in bs_values
by API.h5p_get_filter_by_id
. Perhaps I am missing something?
You also do not bump nelements by 3.
I see now that you added three extra values to the struct similar to how we implemented the Blosc filter.
Hmm, you never initialize bs_values
to 0
.
The way this works, as I understand it, is that the set_local
routine provides an opportunity to fill and check values in the BitshuffleFilter
struct, which is what bs_values
is filled with after H5API.h5p_get_filter_by_id
. It would also be OK to completely skip this step and set everything at BitshuffleFilter initialisation, except that the element size isn't known at the time that the filter is constructed. So in the Julia version I have set almost every value at initialisation, except for element size, but repeat the setting of MAJOR, MINOR values.
Still sorting out access to a Windows VM.
One issue I encountered that relates to GC.@preserve
and decompression is detailed here:
https://github.com/JuliaLang/julia/issues/43402
This was due a specific change on the development branch of Julia though.
What happens if bs_values
is getting freed after bitshuffle_set_local
but before H5Z_filter_bitshuffle
finishes?
I have a Windows box, and so far I am not able to reproduce the error reliably.
I'm going to try to focus testing here: https://github.com/JuliaIO/HDF5.jl/pull/988
One issue I encountered that relates to
GC.@preserve
and decompression is detailed here: JuliaLang/julia#43402This was due a specific change on the development branch of Julia though.
I read through that but couldn't see how it applies here. Weird bug though.
What happens if
bs_values
is getting freed afterbitshuffle_set_local
but beforeH5Z_filter_bitshuffle
finishes?
Well, as bs_values
is of type Ptr{Cuint}
it should not get freed by Julia if I understand the whole Ptr
, Ref
thing.
Well, as
bs_values
is of typePtr{Cuint}
it should not get freed by Julia if I understand the wholePtr
,Ref
thing.
The problem is that you allocate it via bs_values = Vector{Cuint}(undef,8)
. When it gets passed to ccall
, the pointer behind the Vector
is passed to HDF5. After bitshuffle_set_local
completes, Julia thinks it can free the pointer behind bs_values
.
The question is if H5Pmodify_filter
just hangs on to the pointer or if it makes a copy.
H5Pmodify_filter
does seem to copy the data:
https://github.com/HDFGroup/hdf5/blob/c064d3481b582653c1e0d0043a17527fd73e8c4d/src/H5Z.c#L1095-L1096
Aha, I found something.
The C function signature is:
int64_t bshuf_decompress_lz4(const void* in, void* out, const size_t size,
const size_t elem_size, size_t block_size) {
The ccall
however is
err = ccall((:bshuf_decompress_lz4,libbitshuffle),Cint,
(Ptr{Cvoid},Ptr{Cvoid},Cuint,Cuint,Cuint),
in_buf,out_buf,size,elem_size,block_size)
This should be
err = ccall((:bshuf_decompress_lz4,libbitshuffle),Int64,
(Ptr{Cvoid},Ptr{Cvoid},Csize_t,Csize_t,Csize_t),
in_buf,out_buf,size,elem_size,block_size)
julia> Csize_t
UInt64
julia> Cuint
UInt32
This adds the bitshuffle filter for HDF5. Closes #981.