rust-lang / git2-rs

libgit2 bindings for Rust
https://docs.rs/git2
Apache License 2.0

Segfault during fetch #1057

Open ctron opened 3 months ago

ctron commented 3 months ago

I am not sure what the exact condition is, as the code works in some cases. It fails on aarch64.

[Current thread is 1 (Thread 0xffff8e7b7c00 (LWP 5))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-100.el9_4.2.aarch64 libgcc-11.4.1-3.el9.alma.1.aarch64
(gdb) where
#0  0x0000aaaabf1dcdc4 in git_commit_list_insert_by_date ()
#1  0x0000aaaabf1925c8 in git_revwalk__push_commit ()
#2  0x0000aaaabf192b24 in git_revwalk__push_glob ()
#3  0x0000aaaabf1b82a8 in git_smart__negotiate_fetch ()
#4  0x0000aaaabf1e0510 in git_fetch_negotiate ()
#5  0x0000aaaabf187d54 in git_remote__download ()
#6  0x0000aaaabf188ff4 in git_remote_fetch ()
#7  0x0000aaaabe8f33ac in git2::remote::Remote::fetch ()
#8  0x0000aaaabe8fa1c4 in tracing::span::Span::in_scope ()
#9  0x0000aaaabe743be0 in trustify_module_importer::server::common::walker::git::GitWalker<H,T>::run_sync ()
#10 0x0000aaaabe7d61f0 in tokio::runtime::task::core::Core<T,S>::poll ()
#11 0x0000aaaabe841ca0 in tokio::runtime::task::harness::Harness<T,S>::poll ()
#12 0x0000aaaac0087514 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#13 0x0000aaaac0067af4 in core::ops::function::FnOnce::call_once{{vtable.shim}} () at library/core/src/panic.rs:106
#14 0x0000aaaac010c148 in alloc::boxed::{impl#47}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2020
#15 alloc::boxed::{impl#47}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2020
#16 std::sys::pal::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/pal/unix/thread.rs:108
#17 0x0000ffff8ec5e698 in start_thread () from /lib64/libc.so.6
#18 0x0000ffff8ecc8bdc in thread_start () from /lib64/libc.so.6

I assume it is using the vendored version (default build flags, compiled with cross, which doesn't have libgit2 installed afaik). The output of ldd is:

        linux-vdso.so.1 (0x0000ffffb36d2000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000ffffb3664000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000ffffb3643000)
        libm.so.6 => /lib64/libm.so.6 (0x0000ffffae95f000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000ffffae93e000)
        libc.so.6 => /lib64/libc.so.6 (0x0000ffffae780000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffffb3695000)
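
The check done by eye on the `ldd` output above can be sketched in code. This is a minimal sketch using only the standard library; the helper names `shared_libs` and `links_libgit2_dynamically` are hypothetical and not part of git2 or libgit2-sys. It assumes the usual `ldd` line layout, where the first whitespace-separated token is the library name or path. A missing libgit2 entry is consistent with a statically linked (vendored) libgit2, but does not prove it.

```rust
/// Returns the first token of every non-empty `ldd` output line,
/// i.e. the library name (or path, for the dynamic loader).
fn shared_libs(ldd_output: &str) -> Vec<&str> {
    ldd_output
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}

/// True if any listed entry looks like a dynamically linked libgit2.
/// Hypothetical helper: it only compares the file name against "libgit2.so".
fn links_libgit2_dynamically(ldd_output: &str) -> bool {
    shared_libs(ldd_output).into_iter().any(|name| {
        // Strip any directory part so paths and bare names are treated alike.
        let file = name.rsplit('/').next().unwrap_or(name);
        file.starts_with("libgit2.so")
    })
}
```

Feeding it the output above would report no dynamic libgit2 dependency, which matches the assumption that the vendored copy is linked in.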

ctron commented 3 months ago

I am using git2 0.18.3 and libgit2-sys 0.16.2+1.7.2
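
For readers unfamiliar with the version string: the part of a libgit2-sys version after the `+` is build metadata naming the bundled libgit2 release, so `0.16.2+1.7.2` means crate version 0.16.2 bundling libgit2 1.7.2. A minimal sketch of that split, using a hypothetical helper `split_sys_version` (not part of any crate); it only says which libgit2 is bundled, not which one actually ends up linked if a system libgit2 is used instead:

```rust
/// Splits a libgit2-sys style version into
/// (crate version, bundled libgit2 version if present).
fn split_sys_version(version: &str) -> (&str, Option<&str>) {
    match version.split_once('+') {
        // Everything after '+' is semver build metadata.
        Some((krate, bundled)) => (krate, Some(bundled)),
        None => (version, None),
    }
}
```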

ehuss commented 3 months ago

Can you create a reproduction? Otherwise I don't think we'll be able to help.

ctron commented 3 months ago

I briefly tried that, but didn't succeed and went in a different direction. If it helps, I'll try again and see if I can get there, because on that single installation it fails 100% of the time.

ctron commented 3 months ago

Just a quick update: I am trying to narrow it down. With the original application, it seems to work when using cargo to compile and run it. Using cross to compile it (on a similar or the same Ubuntu version that the cross container image should have) results in a segfault, always at the same location.

ctron commented 3 months ago

I do have a reproducer. It's a bit complex and a bit weird. It seems to involve cross and aarch64: https://github.com/ctron/git2-repro

ctron commented 3 months ago

The main difference I figured out so far is that cross actually uses an Ubuntu 16.04 based image, while I was using an Ubuntu 20.04 VM to reproduce this. cross updated the container image's base layer, but it seems like they never released that.

ctron commented 3 months ago

Since everything is vendored and static, I don't understand what flows into the final binary that makes a difference.

ctron commented 3 months ago

If I swap out the build image for the edge version (which is built from what is in main in git), then it works.

I don't know which parts influence the final binary to misbehave, as to my understanding everything should be vendored.

On the other hand, I am not sure that trying to fix an issue caused by an 8-year-old distribution makes sense.