bhftbootcamp / LibZip.jl

LibZip is a convenient wrapper around libzip for reading, creating, and modifying ZIP and ZIP64 archives. It also supports file encryption and decryption.
MIT License

Segmentation fault when reading #2

Open nhz2 opened 1 month ago

nhz2 commented 1 month ago

Describe the bug

Hello, while testing whether this package is compatible with https://github.com/JuliaIO/ZipArchives.jl, I occasionally got segmentation faults. Below is an MWE in which I manually call GC.gc() to trigger the crash.

To Reproduce

using LibZip
archive = ZipArchive(; flags = LIBZIP_CREATE)
write(archive, "greetings.txt", rand(UInt8, 100000000))
zip_compress_file!(archive, "greetings.txt")
# Write the zip archive to a specified file path
my_path = tempname()
write(my_path, archive)
close(archive)

zip_data = read(my_path)
archive = ZipArchive(zip_data; flags = LIBZIP_RDONLY)
zip_data = nothing
GC.gc()
GC.gc()
GC.gc()
read(archive, "greetings.txt")
[110590] signal 11 (1): Segmentation fault
in expression starting at REPL[14]:1
unknown function (ip: 0x736a58da67f7)
read_data at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_source_call at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
zip_source_read at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_read at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_buffer_new_from_source at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_dirent_size at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_file_get_offset at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
window_read at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
_zip_source_call at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
zip_source_open at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
zip_source_open at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
zip_source_open at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
zip_fopen_index_encrypted at /home/nathan/.julia/artifacts/5ad7420957f915c91e9348ac93ca3c1674cb8b0f/lib/libzip.so (unknown line)
libzip_fopen_index at /home/nathan/.julia/packages/LibZip/uolUI/src/LibZip.jl:511 [inlined]
#read#11 at /home/nathan/.julia/packages/LibZip/uolUI/src/ZipTools.jl:473
read at /home/nathan/.julia/packages/LibZip/uolUI/src/ZipTools.jl:465 [inlined]
#read#12 at /home/nathan/.julia/packages/LibZip/uolUI/src/ZipTools.jl:486 [inlined]
read at /home/nathan/.julia/packages/LibZip/uolUI/src/ZipTools.jl:485
unknown function (ip: 0x736a5794fdd6)
jl_apply at /cache/build/builder-demeter6-6/julialang/julia-master/src/julia.h:2157 [inlined]
do_call at /cache/build/builder-demeter6-6/julialang/julia-master/src/interpreter.c:126
eval_value at /cache/build/builder-demeter6-6/julialang/julia-master/src/interpreter.c:223
eval_stmt_value at /cache/build/builder-demeter6-6/julialang/julia-master/src/interpreter.c:174 [inlined]
eval_body at /cache/build/builder-demeter6-6/julialang/julia-master/src/interpreter.c:663
jl_interpret_toplevel_thunk at /cache/build/builder-demeter6-6/julialang/julia-master/src/interpreter.c:821
jl_toplevel_eval_flex at /cache/build/builder-demeter6-6/julialang/julia-master/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/builder-demeter6-6/julialang/julia-master/src/toplevel.c:886
jl_toplevel_eval_flex at /cache/build/builder-demeter6-6/julialang/julia-master/src/toplevel.c:886
jl_toplevel_eval_flex at /cache/build/builder-demeter6-6/julialang/julia-master/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/builder-demeter6-6/julialang/julia-master/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
eval_user_input at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:245
repl_backend_loop at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:342
#start_repl_backend#59 at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:327
start_repl_backend at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:324
#run_repl#72 at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:483
run_repl at /cache/build/builder-demeter6-6/julialang/julia-master/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:469
jfptr_run_repl_10088 at /home/nathan/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/compiled/v1.11/REPL/u0gqU_GYsA8.so (unknown line)
#1139 at ./client.jl:446
jfptr_YY.1139_14649 at /home/nathan/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/compiled/v1.11/REPL/u0gqU_GYsA8.so (unknown line)
jl_apply at /cache/build/builder-demeter6-6/julialang/julia-master/src/julia.h:2157 [inlined]
jl_f__call_latest at /cache/build/builder-demeter6-6/julialang/julia-master/src/builtins.c:875
#invokelatest#2 at ./essentials.jl:1055 [inlined]
invokelatest at ./essentials.jl:1052 [inlined]
run_main_repl at ./client.jl:430
repl_main at ./client.jl:567 [inlined]
_start at ./client.jl:541
jfptr__start_72144.1 at /home/nathan/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-demeter6-6/julialang/julia-master/src/julia.h:2157 [inlined]
true_main at /cache/build/builder-demeter6-6/julialang/julia-master/src/jlapi.c:900
jl_repl_entrypoint at /cache/build/builder-demeter6-6/julialang/julia-master/src/jlapi.c:1059
main at /cache/build/builder-demeter6-6/julialang/julia-master/cli/loader_exe.c:58
unknown function (ip: 0x736a58c29d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7390422 (Pool: 7388804; Big: 1618); GC: 15
Segmentation fault (core dumped)

Expected behavior

I think there is a missing call to Base.cconvert and GC.@preserve somewhere.

Both are needed to ensure that Julia does not free memory that a C library is still using.
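For reference, here is a minimal sketch of the general Base.cconvert / GC.@preserve pattern in plain Julia. It does not use LibZip at all: the function name checksum_via_pointer is made up for illustration, and unsafe_load stands in for a C routine that reads through the pointer.

```julia
# Hypothetical helper: sums the bytes of a buffer through a raw pointer,
# which is how a C routine would see the data.
function checksum_via_pointer(data::Vector{UInt8})
    # cconvert returns the object that must stay alive while the pointer is in use.
    ref = Base.cconvert(Ptr{UInt8}, data)
    total = 0
    GC.@preserve ref begin
        # unsafe_convert yields the raw pointer; it is only guaranteed to be
        # valid while `ref` is preserved, i.e. inside this block.
        ptr = Base.unsafe_convert(Ptr{UInt8}, ref)
        for i in 1:length(data)
            total += unsafe_load(ptr, i)
        end
    end
    return total
end
```

Note that this only covers a pointer used within a single call. If the C library keeps the pointer after the call returns, the Julia object has to be kept reachable by other means for as long as the library may read from it.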

Additional context

julia> versioninfo()
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 7800X3D 8-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)

(jl_Yo1dQg) pkg> st
Status `/tmp/jl_Yo1dQg/Project.toml`
  [89089acc] LibZip v1.0.0

AlexKlo commented 4 weeks ago

Thanks for letting us know about this issue! We really appreciate it, especially the example you provided for reproducing the bug and your potential solution.

AlexKlo commented 2 weeks ago

@nhz2 Could you please clarify how exactly you suggest using GC.@preserve to solve this issue? My attempts didn't yield the desired result. Since the data source is external, I was only able to protect it from the outside:

using LibZip
archive = ZipArchive(; flags = LIBZIP_CREATE)
zip_data = rand(UInt8, 100_000_000);
GC.@preserve zip_data begin
    write(archive, "greetings.txt", zip_data)
    zip_data = nothing
    GC.gc()
end
julia> close(archive)
true

nhz2 commented 5 days ago

I think one way is to save a reference to the thing that needs to be preserved in the ZipArchive struct. Then you can GC.@preserve in any internal methods that use the data.
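Roughly what I have in mind, as a sketch only: BufferBackedHandle and read_byte are hypothetical names, not LibZip.jl's actual types or functions, and the raw pointer field stands in for whatever handle libzip returns.

```julia
# Hypothetical wrapper, NOT the real ZipArchive definition in LibZip.jl.
mutable struct BufferBackedHandle
    ptr::Ptr{UInt8}       # raw pointer that was handed to the C library
    len::Int              # number of valid bytes behind `ptr`
    data::Vector{UInt8}   # keeps the buffer reachable as long as the handle is
end

# Store the buffer itself next to the pointer derived from it.
# The buffer must not be resized afterwards, or the pointer may go stale.
BufferBackedHandle(data::Vector{UInt8}) =
    BufferBackedHandle(pointer(data), length(data), data)

# Internal methods preserve the stored buffer while touching the pointer.
function read_byte(h::BufferBackedHandle, i::Integer)
    1 <= i <= h.len || throw(BoundsError(h.data, i))
    data = h.data
    byte = GC.@preserve data unsafe_load(h.ptr, i)
    return byte
end
```

With this layout, the caller can drop their own reference to the buffer and run GC.gc() without invalidating the pointer, because the handle still references the data.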

I've looked a bit at the docs for libzip and found https://libzip.org/documentation/zip_source_buffer.html, which says:

"data must remain valid for the lifetime of the created source"

But I'm not sure what exactly they mean by lifetime here. You may want to ask a question about this at https://github.com/nih-at/libzip/discussions