Closed by derobins 1 month ago
Sample test failure output:
Test Summary: | Pass Fail Broken Total
HDF5.jl | 1497 2 3 1502
plain | 151 1 152
complex | 13 13
undefined and null | 4 4
abstract arrays | 2 2
empty and 0-size arrays | 39 39
generic read of native types | 17 17
show | 44 44
split1 | 13 13
haskey | 18 18
AbstractString | 51 51
opaque data | 7 7
FixedStrings and FixedArrays | 18 18
Object Exists | 8 8
HDF5 existance | 4 4
bounds | 2 2
create_dataset | 264 264
Strings | 8 8
h5a_iterate | 7 1 8
h5l_iterate | 7 1 8
h5dchunk_iter | 3 3
compound | 10 10
create_dataset (compound) | 4 4
write_compound | 27 27
custom | 6 6
reference | 6 6
null dataspace | 13 13
scalar dataspace | 15 15
simple dataspaces | 98 98
BlockRange | 42 42
hyperslab | 6 6
Datatypes | 15 15
hyperslab | 5 5
read 0-length arrays: issue #859 | No tests
attrs interface | 92 92
variable length strings | 1 1
readremote | 23 23
extend | 29 29
gc | 101 101
external | 6 6
swmr | 4 4
mmap | 9 9
properties | 46 1 47
filter | 80 80
Raw Chunk I/O | 80 80
fileio | 6 6
track order | 18 18
h5f_get_dset_no_attrs_hint | 6 6
non-allocating methods | 11 1 12
Compression Filter Unit Tests | 6 6
Object API | 38 38
virtual dataset | 5 5
mpio | 1 1
ERROR: LoadError: Some tests did not pass: 1497 passed, 2 failed, 0 errored, 3 broken.
in expression starting at /home/runner/work/hdf5/hdf5/test/runtests.jl:34
ERROR: LoadError: Package HDF5 errored during testing
Stacktrace:
[1] pkgerror(msg::String)
@ Pkg.Types /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Types.jl:55
[2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing)
@ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1712
[3] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Vector{String}, test_args::Cmd, kwargs::Base.Iterators.Pairs{Symbol, IOContext{Base.PipeEndpoint}, Tuple{Symbol}, NamedTuple{(:io,), Tuple{IOContext{Base.PipeEndpoint}}}})
@ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:343
[4] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{Base.PipeEndpoint}, kwargs::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:coverage, :julia_args), Tuple{Bool, Vector{String}}}})
@ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:80
[5] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:coverage, :julia_args), Tuple{Bool, Vector{String}}}})
@ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:96
[6] top-level scope
@ ~/work/_actions/julia-actions/julia-runtest/latest/test_harness.jl:15
[7] include(fname::String)
@ Base.MainInclude ./client.jl:444
[8] top-level scope
@ none:1
in expression starting at /home/runner/work/_actions/julia-actions/julia-runtest/latest/test_harness.jl:7
Error: Process completed with exit code 1.
@mkitti - Any ideas?
Could you point me to the CI output?
These both point to issues with the callback mechanism for the iteration functions. I'm not sure which exact test is failing yet though.
Incidentally, we also seem to be having some issues with Windows builds lately: https://github.com/JuliaPackaging/Yggdrasil/pull/8588
Error output (Autotools) here:
https://github.com/HDFGroup/hdf5/actions/runs/9333687611/job/25707113092
Any recent test failure in HDF5 will likely be a Julia failure.
Yeah, with the randomness of the error, my guess is that there is some uninitialized memory usage someplace. Maybe -fsanitize=memory on clang would help.
Yes, I'm noticing the randomness as well. The issue appears to involve an error being thrown within the Julia callback function. The error gets caught by a Julia try-catch and the callback returns -1.
The problem is that after iteration stops, we are not receiving the error code upon return of H5Aiterate2.
The CI test that is failing checks to see that an error is received when the callback throws an error. The test fails because the error is not detected.
The Julia error reference itself is returned via opdata.
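To make the mechanism described above concrete, here is a minimal sketch of the pattern in Python rather than Julia or C. Everything in it is an assumption for illustration: `mock_iterate` is a hypothetical stand-in for an iteration routine such as H5Aiterate2 (modelling only the contract that a non-zero callback return stops iteration and is handed back to the caller), and `make_safe_callback`, `iterate_attributes`, and `failing_visitor` are invented names, not HDF5.jl code.

```python
from typing import Any, Callable, Dict, List


def mock_iterate(names: List[str],
                 op: Callable[[str, Dict[str, Any]], int],
                 op_data: Dict[str, Any]) -> int:
    """Hypothetical stand-in for an HDF5 iteration routine.

    Modelled contract (an assumption of this sketch): a callback return
    of 0 continues, any non-zero value stops the iteration and is
    returned to the caller unchanged.
    """
    for name in names:
        ret = op(name, op_data)
        if ret != 0:
            return ret
    return 0


def make_safe_callback(user_fn: Callable[[str], Any]):
    """Wrap user_fn so that no exception crosses the callback boundary."""
    def callback(name: str, op_data: Dict[str, Any]) -> int:
        try:
            user_fn(name)
            return 0
        except Exception as exc:
            # Keep a reference to the error for the caller, then signal
            # failure through the return value.
            op_data["error"] = exc
            return -1
    return callback


def iterate_attributes(names: List[str], user_fn: Callable[[str], Any]) -> int:
    """Run the iteration and re-raise any error captured by the callback."""
    op_data: Dict[str, Any] = {"error": None}
    status = mock_iterate(names, make_safe_callback(user_fn), op_data)
    if status < 0:
        # This check is the only thing that triggers the re-raise.
        err = op_data["error"]
        raise err if err is not None else RuntimeError("iteration failed")
    return status


def failing_visitor(name: str) -> None:
    """Example user callback that fails on one particular attribute."""
    if name == "b":
        raise ValueError(f"cannot handle attribute {name!r}")


if __name__ == "__main__":
    try:
        iterate_attributes(["a", "b", "c"], failing_visitor)
    except ValueError as exc:
        print("error propagated:", exc)
```

In this sketch the re-raise depends entirely on the negative status coming back from the iteration routine. If that value were lost on return, the stored error would go unnoticed, which would be consistent with the symptom reported here.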
I'm preparing to disable the affected tests here: https://github.com/JuliaIO/HDF5.jl/pull/1155
I will merge shortly.
I have a successful CI run here: https://github.com/JuliaIO/HDF5.jl/actions/runs/9341332687/attempts/1
I'm running it one more time before I merge to make sure that there are no stochastic errors now.
The Julia GitHub CI actions have been broken for the past week or two, both in Autotools and CMake. There were no obvious changes that could have caused these failures. They will often pass when re-run.
We will need to investigate why they are failing. Since it's random, it may be a memory issue, either in the Julia wrappers or the HDF5 library.