eza-community / eza

A modern alternative to ls
https://eza.rocks
European Union Public License 1.2
11.52k stars 208 forks source link

bug: `--git` fails to mark ignored files when a lone `*` is used in `.gitignore` #521

Open cyqsimon opened 12 months ago

cyqsimon commented 12 months ago

Summary

When a lone * is specified in .gitignore, eza -al --git fails to mark ignored files as I.

Steps to reproduce

Setup:

mkdir eza-git-test
cd eza-git-test
git init
echo -e '*\n!.gitignore\n!foo.txt' > .gitignore
touch foo.txt bar.log

Then:

git status
# notice that bar.log is ignored
eza -al --git
# notice that bar.log is not marked as I

Versions

eza: v0.14.1, latest main (6a4f911) OS: EndeavourOS (Linux 6.5.6-arch2-1 x86_64)

PThorpe92 commented 11 months ago

Very odd because this information is fetched from libgit2 if I'm not mistaken. We definitely don't just parse the .gitignore file to apply the rules. I'm curious if this is the case with other projects that rely on this in the same way

cyqsimon commented 11 months ago

Okay so I took some time and dug deep into the code. TLDR I do agree with @PThorpe92's assessment that it's a libgit2 issue.

The statuses are obtained from libgit2 here:

https://github.com/eza-community/eza/blob/9805dd587936ba9e8e61d216b5f96b51b4676a24/src/fs/feature/git.rs#L218-L244

When git2::Repository::statuses is called with None options, bar.log in my example is not listed in the returned statuses. I played around with various git2::StatusOptions settings, and it turns out if you set recurse_untracked_dirs, bar.log does correctly receive its ignored status. However this will change other behaviour too, as detailed in libgit2 documentation. So for example target/ will no longer get its ignored status (but its children will). No other options I set seem to have any relevant effect.

All this is to say this is most likely worthy of submitting a bug report to upstream.

PThorpe92 commented 11 months ago

I would agree that it is, do you intend on submitting the issue? I"m sure it's a mailing list type deal for libgit2. but there is also the Rust bindings crate itself

cyqsimon commented 11 months ago

I think it's best to notify git2-rs's primary maintainer first. @ehuss what do you think about this issue?

LunarLambda commented 8 months ago

Ran into the same thing just now, with a /* pattern. Worth noting it correctly marks directories as ignored, but not the top-level files.

LunarLambda commented 1 week ago

Okay so I took some time and dug deep into the code. TLDR I do agree with @PThorpe92's assessment that it's a libgit2 issue.

The statuses are obtained from libgit2 here:

https://github.com/eza-community/eza/blob/9805dd587936ba9e8e61d216b5f96b51b4676a24/src/fs/feature/git.rs#L218-L244

When git2::Repository::statuses is called with None options, bar.log in my example is not listed in the returned statuses. I played around with various git2::StatusOptions settings, and it turns out if you set recurse_untracked_dirs, bar.log does correctly receive its ignored status. However this will change other behaviour too, as detailed in libgit2 documentation. So for example target/ will no longer get its ignored status (but its children will). No other options I set seem to have any relevant effect.

All this is to say this is most likely worthy of submitting a bug report to upstream.

I've been looking at this today and that doesn't really make a ton of sense to me.

Passing None means git_status_list_new defaults to these flags, which includes recurse_untracked_dirs.

Apparently, the real issue is that * marks . (the directory itself) as ignored, causing entries to not be listed, unless recurse_ignored_dirs is set to true.

Helpfully, the git2 docs for this option state


    /**
     * Indicates that the contents of ignored directories should be included
     * in the status. This is like doing `git ls-files -o -i --exclude-standard`
     * with core git.
     */

Which does seem like exactly what we would want. Though it is a little bit bothersome having to recurse over all files in the tree just to get a correct top-level listing.

LunarLambda commented 1 week ago

At least we can say now that this is neither a bug in git2-rs, nor a bug in eza, but (if it is a bug at all, and not an odd feature), a bug in libgit2 itself.

EDIT: Filed https://github.com/libgit2/libgit2/issues/6890 Will see what comes out

cafkafk commented 1 week ago

Passing None means git_status_list_new defaults to these flags, which includes recurse_untracked_dirs.

This is in repo.statuses(None)? Perhaps we should specify StatusOptions with include_untracked and include_ignored to get default behavior minus recurse_untraced_dirs?

But I'm not sure I understand the total implications of doing that to be honest, I wonder if there is a reasonable usecase that would break in exchange (i.e., if we do this, let's consider marking it breaking and/or being careful).


Ohh, and thanks for diagnosing this issue, much appreciated :heart:

LunarLambda commented 1 week ago

recurse_untracked_dirs has no effect on the ignore issue.

Note that there's both recurse_UNTRACKED_dirs, and recurse_IGNORED_dirs. the former is set by default (statuses(None)), the later is not. Setting the latter mitigates the issue of top-level ignored files not being treated properly, but also means much more entries to parse.

Since the default behaviour is quite strange (correctly handling ignored directories, but not files), I filed an issue on the libgit2 repo.

Adding recurse_ignored_dirs would fix the issue at the cost of performance (both memory and runtime), but I'd be curious to see what the libgit2 people say also.

LunarLambda commented 1 week ago

A tiny reproducer for this is something like

cargo new git2-test
cd git2-test
cargo add git2
touch foo
fn main() {
     let dir = std::env::args().nth(1).unwrap_or(".".into());

     let mut opts = git2::StatusOptions::new();

     // default git status options, like `statuses(None)`
     opts.include_untracked(true).include_ignored(true).recurse_untracked_dirs(true);

     // [un]comment/test with
     opts.recurse_ignored_dirs(true);

     let repo = git2::Repository::open(&dir).unwrap();

     for entry in repo.statuses(Some(&mut opts)).unwrap().iter() {
         println!("{:?} {:?}", entry.path(), entry.status());
     }
}

With the recurse_ignore_dirs line commented out, and .gitignore:

/target
foo

Works properly (file foo will be listed as ignored)

/target
*

Doesn't work (only directories src/ and target/ will be listed)

With recurse_ignore_dirs enabled, foo gets listed as ignored, and so does everything in target/ (= a lot of files)