rust-lang / git2-rs

libgit2 bindings for Rust
https://docs.rs/git2
Apache License 2.0
1.67k stars 384 forks source link

Diff aborts with `core.autocrlf` and an emptied file #900

Closed mohamed-abdelnour closed 1 year ago

mohamed-abdelnour commented 1 year ago

Hello, everyone!

git2-rs currently aborts when:

Trying to put that into a sentence was confusing, hopefully that made any sense.

I haven't looked into the actual cause of the problem just yet, but I have been able to reliably reproduce it (I hope).

Steps to reproduce for x86_64-unknown-linux-gnu

Here is a minimal example:

use std::{error::Error, fs::File, io::Write, path::Path};

use git2::Repository;

// Reproduce the abort inside `dir`, where `dir` is a directory.
pub(crate) fn minimal(dir: impl AsRef<Path>) -> Result<(), Box<dyn Error>> {
    const FILE_NAME: &str = "file";

    let dir = dir.as_ref();

    // Initialise a Git repository.
    let repo = Repository::init(dir)?;

    // Set `core.autocrlf`.
    repo.config()?.set_bool("core.autocrlf", true)?;
    // OR
    // repo.config()?.set_str("core.autocrlf", "input")?;

    // Create a non-empty file and stage it.
    let relative_path = Path::new(FILE_NAME);
    let mut file = File::create(dir.join(relative_path))?;
    file.write_all(b"filler\n")?;

    let mut index = repo.index()?;

    index.add_path(relative_path)?;
    index.write()?;

    // Empty the file.
    file.set_len(0)?;

    let diff = repo.diff_index_to_workdir(None, None)?;

    // Diff => aborts.
    diff.foreach(&mut |_, _| true, None, None, None)
        .map_err(Into::into)
}

For convenience, I added a binary crate that calls this function with a temporary directory, link: https://github.com/mohamed-abdelnour/git2-rs/tree/abort_demo/.

Using that:

Expected result

The expected output of the program is:

free(): invalid pointer

It may be followed by something along the lines of: Aborted (core dumped).

The program raises signal 6 (SIGABRT) which most shells indicate by exiting with code 134 (128 + 6).

Tracing

I've also included sample logs generated with a few tools to trace the abort, feel free to check them out here: https://github.com/mohamed-abdelnour/git2-rs/tree/abort_demo/sample_logs.

For example, running AddressSanitizer with rustup installed:

RUSTFLAGS=-Zsanitizer=address cargo +nightly run --features=contained --target=x86_64-unknown-linux-gnu

Which should produce an output similar to: https://github.com/mohamed-abdelnour/git2-rs/blob/309ac9e76cfbb55bae03ad330ae6698592fe2009/sample_logs/asan.log.

ehuss commented 1 year ago

Thanks for the report! I have reported this upstream in https://github.com/libgit2/libgit2/issues/6431.

mohamed-abdelnour commented 1 year ago

Thank you!

Something to note: I can't reproduce the problem with the snippet you posted in libgit2/libgit2#6431 in a clean environment.

For example, with the snippet in main.c:

clang -lgit2 -o main main.c && env -i ./main

This doesn't abort and exits successfully.

With the following patch in diff.patch:

diff --git a/main.c b/main.c
index 8b88dd1..9de49cc 100644
--- a/main.c
+++ b/main.c
@@ -14,6 +14,10 @@ int main() {
   git_repository *repo;
   assert(git_repository_init(&repo, "repo", 0) == 0);

+  git_config *config;
+  assert(git_repository_config(&config, repo) == 0);
+  assert(git_config_set_bool(config, "core.autocrlf", 1) == 0);
+
   FILE *fp;
   fp = fopen("repo/file", "w");
   fputs("this is a test\n", fp);
patch -p1 <diff.patch
clang -lgit2 -o main main.c && env -i ./main

This aborts as expected.

I'm just pointing this out in case someone doesn't have core.autocrlf set globally and can't reproduce the problem.

ehuss commented 1 year ago

Hm, that's strange. I tested with and without it, and didn't notice a difference. I don't have core.autocrlf configured (I removed all my git configuration to test). The diff code is pretty complex, but I think it requires some kind of "filter" to be activated. I'm not sure what is different, but I'll update the issue.

mohamed-abdelnour commented 1 year ago

Thank you for updating the issue!

That is strange. I don't know anything about the code of libgit2, so I'm also not sure what's different.

I don't want to take much more of your time on this (especially that it's a problem upstream), but I thought I'd try to minimise the setup we're doing with libgit2 and see if it makes the results more consistent.

If you want to try again on your end:

Makefile

main: main.c
    cc -l git2 -o main main.c

main.c

#include <assert.h>
#include <git2/diff.h>
#include <git2/global.h>
#include <git2/repository.h>

int main() {
  assert(git_libgit2_init() == 1);

  git_repository *repo;
  assert(git_repository_open(&repo, "repo") == 0);

  git_diff *diff;
  assert(git_diff_index_to_workdir(&diff, repo, NULL, NULL) == 0);

  assert(git_diff_foreach(diff, NULL, NULL, NULL, NULL, NULL) == 0);

  return 0;
}

abort_demo (executable)

#!/usr/bin/env sh

REPO='repo'
FILE='file'

run() {
    stderr=$(./main 2>&1)
    printf '{"case":"%s","exit_code":%u,"stderr":"%s"}\n' "$1" $? "$stderr"
}

# shellcheck disable=2064
trap "rm -rf $REPO" EXIT

git init --quiet "$REPO" && (
    cd "$REPO" && {
        printf 'this is a test\n' >"$FILE"
        git add "$FILE"

        : >"$FILE"
    }
)

# shellcheck disable=2016
make --quiet && {
    run 'without setting `core.autocrlf`'

    (cd "$REPO" && git config --local core.autocrlf true)
    run 'setting `core.autocrlf` to `true`'
}

For me, running

env --ignore-environment PATH="$PATH" ./abort_demo | jq --slurp

(piping through jq --slurp being optional), gives:

[
  {
    "case": "without setting `core.autocrlf`",
    "exit_code": 0,
    "stderr": ""
  },
  {
    "case": "setting `core.autocrlf` to `true`",
    "exit_code": 134,
    "stderr": "free(): invalid pointer"
  }
]
mohamed-abdelnour commented 1 year ago

I just noticed that this has been fixed upstream as of libgit2/libgit2@b755b7d294780bf7dc1c9b5c5fff892ac5f0ec38 (earliest contained in v1.6.1).

git2-rs includes the fix starting v0.17.0.

Thank you all for your work!