NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Remove runtime dependency on `git` package #9807

Open roberth opened 7 months ago

roberth commented 7 months ago

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Use libgit2 for everything. Do not call the Git CLI from Nix. We'll still need it in the functional test suite, but that's ok. Do not regress in functionality.

Additional context

Priorities

Personally, I just opened this for tracking purposes.

Add :+1: to issues you find important.

lf- commented 7 months ago

You should be aware that libgit2 has severe performance problems, which would be rather unfortunate given that git is already not the fastest thing in the world. It seems much more sensible to deliberately put a fallback git into the closure of Nix, so it will consistently be available, and be of the better implementation.

roberth commented 7 months ago

> You should be aware that libgit2 has severe performance problems

We've seen some performance regressions that we expect to be offset by the ability to operate on subtrees of repositories, and not adding everything to the store. I haven't seen anything that I'd have to call severe. @lf- Could you elaborate on that? Do you happen to know why this might be?

Note also that we use libgit2 to sidestep not entirely reproducible behaviors in the git checkout and archive logic, such as smudge filters and export-ignore. This is solved simply and reliably by accessing the tree objects directly.
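To illustrate why reading tree objects directly is unaffected by checkout-time behavior, here is a minimal sketch of parsing a raw tree object. It is not how Nix does it (Nix goes through libgit2 in C++); it only shows the on-disk format, where each entry is `<mode> <name>\0<20-byte id>` and no attributes or filters are consulted. The names `TreeEntry`, `parse_tree` and `blob_oid` are made up for this example, and it assumes a SHA-1 repository and a tree body that has already been decompressed and had its `tree <size>\0` header stripped.

```python
import hashlib
from typing import Iterator, NamedTuple


class TreeEntry(NamedTuple):
    mode: str  # e.g. "100644" (file), "100755" (executable), "40000" (subtree)
    name: str  # entry name within this tree (no directory components)
    oid: str   # hex object id of the blob or subtree


def parse_tree(body: bytes) -> Iterator[TreeEntry]:
    """Yield the entries of a raw tree object body (SHA-1 repositories only)."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)   # end of the octal mode
        nul = body.index(b"\0", space)  # end of the entry name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "surrogateescape")
        oid = body[nul + 1:nul + 21].hex()  # 20 raw bytes follow the NUL
        yield TreeEntry(mode, name, oid)
        pos = nul + 21


def blob_oid(content: bytes) -> str:
    """Object id git assigns to a blob with exactly these bytes."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()


# Hand-built tree with a single file entry, just to exercise the parser.
example_body = b"100644 README\0" + bytes.fromhex(blob_oid(b"hello\n"))
for entry in parse_tree(example_body):
    print(entry.mode, entry.name, entry.oid)
```

The blob id depends only on the stored bytes, so whatever a smudge filter or `export-ignore` would do at checkout or archive time never enters the picture.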

> deliberately put a fallback git [command] into the closure

There's no doubt that we keep it as long as we need it. Performance can be a valid reason to keep it.

abathur commented 7 months ago

I've used a very teeny slice of libgit2 indirectly through Python and Rust bindings and my experience was that it's a mix when it comes to performance.

In particular, I found diffing was quite a bit slower (especially on big repos) via libgit2 than by shelling out to git. (Some other stuff was quite a bit faster, though.)

I think it just merits benchmarking each implementation against a good mix of repo sizes. Maybe an easy-to-run ~benchmarking suite pays off over time by warding off problems and encouraging people to experiment with optimizations.
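A benchmarking suite along those lines could start as small as the sketch below. It is only an outline under assumptions: `benchmark` is a hypothetical helper, each implementation is any callable that takes a repository path, and the two placeholder callables stand in for "shell out to git" and "call a libgit2 binding", neither of which is implemented here.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def benchmark(
    implementations: Dict[str, Callable[[str], object]],
    repos: Iterable[str],
    repeats: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Return the median wall-clock seconds for each (repo, implementation) pair."""
    results: Dict[str, Dict[str, float]] = {}
    for repo in repos:
        results[repo] = {}
        for name, run in implementations.items():
            samples = []
            for _ in range(repeats):
                start = time.perf_counter()
                run(repo)
                samples.append(time.perf_counter() - start)
            # The median is less sensitive to a single slow outlier than the mean.
            results[repo][name] = statistics.median(samples)
    return results


# Placeholder implementations; real ones would run the operation under test
# (clone, fetch, diff, ...) against the repository at the given path.
timings = benchmark(
    {"git-cli": lambda repo: None, "libgit2": lambda repo: None},
    repos=["small-repo", "large-repo"],
    repeats=3,
)
for repo, per_impl in timings.items():
    print(repo, per_impl)
```

Keeping the operations as plain callables makes it easy to add a third backend, such as gitoxide, without touching the harness.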

Edit: there's also the Rust gitoxide project. It's an alternative to the Rust binding, but not a drop-in replacement. It also has a nascent gix CLI, but IIRC it is likewise not a drop-in replacement. If there are any obvious performance hotspots, it might be worth exploring whether an equivalent implementation can use gitoxide or gix to see how they perform. The maintainer's pretty responsive with respect to tracking/prioritizing missing features to help onboard projects.

lf- commented 7 months ago

Here's an example of the pain it's caused with cargo: https://github.com/rust-lang/cargo/issues/11567#issuecomment-1380344790

It's plausible that Nix's use case of pulling objects out of the store for a vfs would be faster on libgit2, and yet also that clone would be unacceptably slow unless Nix shells out to git instead. I can believe that various use cases end up with very different perf results.

roberth commented 7 months ago

Regarding libgit2 performance, romkatv has made optimizations that may be interesting to upstream. https://github.com/libgit2/libgit2/pull/5044 attempted to upstream a number of them. Perhaps we could cherry-pick ones that are relevant to us and turn them into smaller PRs that are easier for upstream to merge? Their motivating use case is a highly optimized git status: https://github.com/romkatv/gitstatus/blob/master/README.md#why-fast

abathur commented 7 months ago

> Their motivating use case is a highly optimized git status

Interesting. Hadn't spotted that before. Amusingly, my motivating case is more or less the same (I implemented lilgit as a more minimal alternative to gitstatus to see if I could claw back more time) :)

lorenzleutgeb commented 3 months ago

Related to:

lorenzleutgeb commented 3 months ago

I filed #10567, which goes in the opposite direction of this issue. Great input by @fricklerhandwerk makes me think that removing git as a runtime dependency is the way to go. However, on an IMO more basic level, there should be an interface for plugging in fetchers. See also my comment at https://github.com/NixOS/nix/pull/10567#issuecomment-2122209566. Commenting this here for reach regarding "pluggable fetchers".

roberth commented 3 months ago

When libgit2 supports remote helpers, #10567 and removing the git command dependency won't be in opposition.

I've added "Do not regress in functionality" to the issue description, because I'm certainly not advocating for breaking people's stuff for some minor technical reason.

lorenzleutgeb commented 3 months ago

> When libgit2 supports remote helpers, #10567 and removing the git command dependency won't be in opposition.

Right. But do you have any indication that libgit2 would consider adding this, or even better someone working on it? I searched their repo but couldn't find anything. Don't get me wrong, I think it'd be very cool if it would support remote helpers, I am just genuinely asking.

roberth commented 3 months ago

I haven't talked to them about these features; I've only opened this issue to track what we'd need, because it came up in review and discussions. I don't think this issue is strategically important for Nix, but any efforts will be appreciated.

matthewbauer commented 2 months ago

It looks like libgit2 is used pretty extensively. I wonder what is remaining that calls git?

roberth commented 2 months ago

Most significantly, Nix still calls git for clone/fetch, for performance and functionality reasons; see Additional context for blockers. It also calls git for a signature check, and for git add in the flakes CLI. Currently you can find the call sites with:

git grep -E 'runProgram\("git"|.program = "git"'
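For anyone who wants the same search without a git checkout (for example on an unpacked source tarball), a rough Python equivalent is sketched below. It reuses the regular expression from the command above unchanged. The helper name `find_git_call_sites` and the `.cc`/`.hh` suffix filter are assumptions made for this example, and unlike `git grep` it walks every file under the directory rather than only tracked files.

```python
import re
from pathlib import Path
from typing import Iterator, Sequence, Tuple

# Same pattern as the git grep command above; the leading "." before
# "program" is a regex wildcard there too, so it is kept as-is.
CALL_SITE = re.compile(r'runProgram\("git"|.program = "git"')


def find_git_call_sites(
    root: str, suffixes: Sequence[str] = (".cc", ".hh")
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, stripped line) for each matching source line."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if CALL_SITE.search(line):
                yield str(path), lineno, line.strip()


if __name__ == "__main__":
    # Run from the top of a source tree; prints one hit per line.
    for path, lineno, line in find_git_call_sites("."):
        print(f"{path}:{lineno}: {line}")
```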