JuliaIO / HDF5.jl

Save and load data in the HDF5 file format from Julia
https://juliaio.github.io/HDF5.jl
MIT License
383 stars 139 forks source link

Make libhdf5 and libhdf5_hl const globals #1097

Closed mkitti closed 1 year ago

mkitti commented 1 year ago

libhdf5 and libhdf5_hl must be const globals.

mkitti commented 1 year ago

Without making these const, I noticed that we end up with a dynamic symbol lookup from @code_llvm HDF5.API.h5_open() from our ccalls.

try:                                              ; preds = %L18
;  @ C:\Users\kittisopikulm\.julia\dev\HDF5\src\api\functions.jl:118 within `h5_open`
  %40 = load atomic void ()*, void ()** @libname_H5open_428 unordered, align 8
  %.not28 = icmp eq void ()* %40, null
  br i1 %.not28, label %dlsym, label %ccall

dlsym:                                            ; preds = %try
  %41 = load atomic {}*, {}** inttoptr (i64 140711346990224 to {}**) unordered, align 16
  %42 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe30, i64 0, i64 2
  store {}* %41, {}** %42, align 16
  %43 = call void ()* @ijl_lazy_load_and_lookup({}* %41, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @_j_str1, i64 0, i64 0))
  store atomic void ()* %43, void ()** @libname_H5open_428 release, align 8
  br label %ccall

ccall:                                            ; preds = %dlsym, %try
  %44 = phi void ()* [ %40, %try ], [ %43, %dlsym ]
  %45 = bitcast void ()* %44 to i32 ()*
  %46 = call i32 %45()
  store volatile i8 1, i8* %phic, align 1
  store volatile i32 %46, i32* %phic1, align 4
  call void @ijl_pop_handler(i32 1)
  call void @llvm.lifetime.end.p0i8(i64 320, i8* nonnull %.sub31)
  br label %L31

err:                                              ; preds = %L66
;  @ C:\Users\kittisopikulm\.julia\dev\HDF5\src\api\functions.jl:120 within `h5_open`
  call void @ijl_undefined_var_error({}* inttoptr (i64 1705555475792 to {}*))
  unreachable

After this change, the call out is much simpler:

try:                                              ; preds = %L18
;  @ C:\Users\kittisopikulm\.julia\dev\HDF5\src\api\functions.jl:118 within `h5_open`
  %40 = call i32 inttoptr (i64 140710727853568 to i32 ()*)()
  store volatile i8 1, i8* %phic, align 1
  store volatile i32 %40, i32* %phic1, align 4
  call void @ijl_pop_handler(i32 1)
  call void @llvm.lifetime.end.p0i8(i64 320, i8* nonnull %.sub30)
  br label %L31

err:                                              ; preds = %L66
;  @ C:\Users\kittisopikulm\.julia\dev\HDF5\src\api\functions.jl:120 within `h5_open`
  call void @ijl_undefined_var_error({}* inttoptr (i64 2777687145808 to {}*))
  unreachable

The 140710727853568 above is the function pointer for H5open.

julia> Libdl.dlsym(HDF5_jll.libhdf5_handle, :H5open) |> Int
140710727853568
mkitti commented 1 year ago

Here is a reduced MWE:

julia> using HDF5_jll

julia> const libhdf5_const = HDF5_jll.libhdf5
"C:\\Users\\kittisopikulm\\.julia\\artifacts\\5722a7cd9c1d2e64d4628db79b5e0eab97f81e88\\bin\\libhdf5-310.dll"

julia> f() = @ccall libhdf5.H5open()::Int
f (generic function with 1 method)

julia> g() = @ccall libhdf5_const.H5open()::Int
g (generic function with 1 method)

julia> @code_llvm f()
;  @ REPL[13]:1 within `f`
; Function Attrs: uwtable
define i64 @julia_f_322() #0 {
top:
  %gcframe2 = alloca [3 x {}*], align 16
  %gcframe2.sub = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe2, i64 0, i64 0
  %0 = bitcast [3 x {}*]* %gcframe2 to i8*
  call void @llvm.memset.p0i8.i32(i8* noundef nonnull align 16 dereferenceable(24) %0, i8 0, i32 24, i1 false)
  %1 = call {}*** inttoptr (i64 140710433750464 to {}*** ()*)() #4
  %2 = bitcast [3 x {}*]* %gcframe2 to i64*
  store i64 4, i64* %2, align 16
  %3 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe2, i64 0, i64 1
  %4 = bitcast {}** %3 to {}***
  %5 = load {}**, {}*** %1, align 8
  store {}** %5, {}*** %4, align 8
  %6 = bitcast {}*** %1 to {}***
  store {}** %gcframe2.sub, {}*** %6, align 8
  %7 = load atomic void ()*, void ()** @libname_H5open_324 unordered, align 8
  %.not = icmp eq void ()* %7, null
  br i1 %.not, label %dlsym, label %ccall

dlsym:                                            ; preds = %top
  %8 = load atomic {}*, {}** inttoptr (i64 140711346990224 to {}**) unordered, align 16
  %9 = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe2, i64 0, i64 2
  store {}* %8, {}** %9, align 16
  %10 = call void ()* @ijl_lazy_load_and_lookup({}* %8, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @_j_str1, i64 0, i64 0))
  store atomic void ()* %10, void ()** @libname_H5open_324 release, align 8
  br label %ccall

ccall:                                            ; preds = %dlsym, %top
  %11 = phi void ()* [ %7, %top ], [ %10, %dlsym ]
  %12 = bitcast void ()* %11 to i64 ()*
  %13 = call i64 %12()
  %14 = load {}*, {}** %3, align 8
  %15 = bitcast {}*** %1 to {}**
  store {}* %14, {}** %15, align 8
  ret i64 %13
}

julia> @code_llvm g()
;  @ REPL[14]:1 within `g`
; Function Attrs: uwtable
define i64 @julia_g_325() #0 {
top:
  %0 = call i64 inttoptr (i64 140710727853568 to i64 ()*)()
  ret i64 %0
}
mkitti commented 1 year ago

After discussing this in Slack, I've been told that this is a bad idea because in general this will mess up PackageCompiler.jl.

cc: @simonbyrne @musm

musm commented 1 year ago

sounds good. Would const global refs e.g. const x = Ref() help?

mkitti commented 1 year ago

One would need to test this, but I think this is effectively the same as the current situation post Julia 1.6. The JLLs exported a typed global.