crystal-lang / shards

Dependency manager for the Crystal language

Leverage libgit2 instead of the git CLI tool #129

Open ysbaddaden opened 8 years ago

ysbaddaden commented 8 years ago

Maybe it would be better to use libgit2 (API) instead of relying on the presence of the external CLI tool. It could eventually lead to a better design of GitResolver, a simpler way to detect and report errors (e.g. refs not found), and better portability (i.e. Windows).

straight-shoota commented 6 years ago

There is a shard crystal-git/libgit2 with bindings, but it seems not to be actively maintained and is stuck in the middle of a restructuring (https://github.com/crystal-git/libgit2/issues/2#issuecomment-338455484).

ysbaddaden commented 6 years ago

I created this issue 2 years ago, and I'm not sure it's such a good idea now. Anyway, it won't happen anytime soon (or even later). Shards doesn't need external dependencies to bootstrap itself (apart from Crystal) and I want to keep it that way.

Closing for the time being. If someone is willing to try to integrate libgit2, please try it and report successes and issues :-)

skinnyjames commented 2 years ago

Hi all, I'm not sure if this path is a good one, but I've been working on a library that wraps libgit2, and I plugged it into the git_resolver for this project. (Just cloning and fetching so far.)

It's more of a proof of concept than a suggestion, but the tests are passing, which is nice.

If anyone is interested in checking it out, I PR'd against my fork of shards: https://github.com/skinnyjames-forks/shards/pull/1/files

skinnyjames commented 1 year ago

Following up on this, I accidentally made a PR against shards, but as Grits is moving forward I think it's worth revisiting this issue.

From the PR:

Benchmarks

Overall, true git clones appear faster than libgit2 clones, but libgit2 clones consume significantly less memory. If not employed already, concurrency may help.

A benchmark on a repo with many dependencies using hyperfine (gshards is the release binary using libgit2).

hyperfine 'rm -Rf lib && shards install --without-development' 'rm -Rf lib && gshards install --without-development'

Benchmark 1: rm -Rf lib && shards install --without-development
  Time (mean ± σ):      4.076 s ±  0.263 s    [User: 1.844 s, System: 1.789 s]
  Range (min … max):    3.847 s …  4.669 s    10 runs

Benchmark 2: rm -Rf lib && gshards install --without-development
  Time (mean ± σ):      5.929 s ±  0.827 s    [User: 0.721 s, System: 1.340 s]
  Range (min … max):    5.484 s …  8.264 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'rm -Rf lib && shards install --without-development' ran
    1.45 ± 0.22 times faster than 'rm -Rf lib && gshards install --without-development'
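
For anyone wondering where the "1.45 ± 0.22" figure comes from, here is a minimal sketch that recomputes it from the two means and standard deviations reported above. It assumes the two benchmarks are independent and uses ordinary first-order error propagation for a ratio; the function name `relative_speed` is made up for illustration and is not part of hyperfine.

```python
import math

# Means and standard deviations (seconds) copied from the hyperfine output above.
shards_mean, shards_sd = 4.076, 0.263    # resolver using the git CLI
gshards_mean, gshards_sd = 5.929, 0.827  # resolver using libgit2


def relative_speed(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, sd) for slow/fast, assuming independent measurements."""
    ratio = slow_mean / fast_mean
    # For a quotient, the relative errors of both operands add in quadrature.
    sd = ratio * math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, sd


ratio, sd = relative_speed(gshards_mean, gshards_sd, shards_mean, shards_sd)
print(f"{ratio:.2f} ± {sd:.2f}")  # prints "1.45 ± 0.22"
```

Most of the uncertainty comes from the gshards runs (σ of 0.827 s on a 5.929 s mean), which fits the outlier warning printed for that benchmark.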

Notes

straight-shoota commented 1 year ago

Statically linking libgit2 would remove a dependency (known)

Git requires SSL, and linking that statically is trouble. So even if libgit2 is linked statically, it will probably still need libssl as a dynamically loaded dependency.

That said, there are use cases where an additional dependency doesn't matter. For example, when shards is installed via a package manager, that manager can usually take care of having lib dependencies available.

So maybe a libgit2-based resolver can just be an optional alternative to the CLI-based resolver, which can be used in situations where you want a fully self-contained executable that doesn't have any binary dependencies.
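
To make the "optional alternative resolver" idea concrete, here is a rough sketch of how a backend could be selected. It is written in Python only for illustration, and every name in it (`Resolver`, `CliGitResolver`, `LibGit2Resolver`, `pick_resolver`) is hypothetical; none of this reflects the actual shards resolver API.

```python
import shutil
from abc import ABC, abstractmethod


class Resolver(ABC):
    """Hypothetical interface; not the actual shards resolver API."""

    @abstractmethod
    def describe(self) -> str:
        ...


class CliGitResolver(Resolver):
    def describe(self) -> str:
        return "git CLI"


class LibGit2Resolver(Resolver):
    def describe(self) -> str:
        return "libgit2"


def git_on_path() -> bool:
    """True when a `git` executable can be found on PATH."""
    return shutil.which("git") is not None


def pick_resolver(prefer_libgit2: bool, libgit2_available: bool, has_git: bool) -> Resolver:
    """Use libgit2 only when requested and available; otherwise prefer the CLI."""
    if prefer_libgit2 and libgit2_available:
        return LibGit2Resolver()
    if has_git:
        return CliGitResolver()
    if libgit2_available:
        # Last resort: no git executable, but the library can be loaded.
        return LibGit2Resolver()
    raise RuntimeError("no usable git backend found")
```

With this shape the CLI stays the default, and the libgit2 path is only taken when someone opts in or when no git executable exists at all.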

skinnyjames commented 1 year ago

Git requires SSL, and linking that statically is trouble. So even if libgit2 is linked statically, it will probably still need libssl as a dynamically loaded dependency.

Ah, makes sense. Thanks for speaking to it.

So maybe a libgit2-based resolver can just be an optional alternative to the CLI-based resolver, which can be used in situations where you want a fully self-contained executable that doesn't have any binary dependencies.

That could be interesting. I think the biggest value add in using libgit2 might just be not having to shell out for everything - which could lead to a much easier time implementing and maintaining new features.

Clones are slower though, so I'm currently working on an experiment to see whether Crystal's concurrency mechanism can be applied to a custom transport. Will definitely keep this thread updated.

Another note that might be problematic is that Grits is linked against libgit2.so.1.3 specifically, because I noticed some of the APIs weren't compatible between versions. I'd love to remove this requirement by making it work with latest, but I haven't done so yet.
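
As an illustration of the version pinning issue, here is a small sketch that checks at runtime which libgit2 is installed before relying on it. It is written in Python with ctypes purely as an example and is not how Grits does it. It assumes the shared library can be located under the name "git2" and calls `git_libgit2_version` from libgit2's C API; the `is_supported` helper and its `(1, 3)` default are assumptions taken from the libgit2.so.1.3 requirement mentioned above.

```python
import ctypes
import ctypes.util


def libgit2_version():
    """Return (major, minor, rev) of the libgit2 found on this system, or None."""
    name = ctypes.util.find_library("git2")  # for example "libgit2.so.1.3" on Linux
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        version_fn = lib.git_libgit2_version
    except (OSError, AttributeError):
        # Library could not be loaded, or the symbol is missing.
        return None
    major, minor, rev = ctypes.c_int(), ctypes.c_int(), ctypes.c_int()
    # C signature: int git_libgit2_version(int *major, int *minor, int *rev)
    version_fn(ctypes.byref(major), ctypes.byref(minor), ctypes.byref(rev))
    return major.value, minor.value, rev.value


def is_supported(version, wanted=(1, 3)):
    """True when the installed major.minor matches the pair the bindings target."""
    return version is not None and tuple(version[:2]) == wanted


if __name__ == "__main__":
    found = libgit2_version()
    print("libgit2:", found, "supported:", is_supported(found))
```

A check like this only turns a load failure into a clearer message; it does not remove the need to rebuild the bindings when the API changes between versions.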

hugopl commented 1 year ago

I used libgit in one of my projects and what I can say is:

It breaks the API very often, so it's normal for me to upgrade my system then notice that I can't load my program anymore because it failed to load libgit.so... then I need to recompile it... OTOH git CLI is stable as a rock and works.

skinnyjames commented 1 year ago

It breaks the API very often, so it's normal for me to upgrade my system then notice that I can't load my program anymore because it failed to load libgit.so... then I need to recompile it... OTOH git CLI is stable as a rock and works.

Yeah. I've run into this too. Using libgit as a resolver would likely mean statically linking some version of libgit. I haven't had much success in using a custom transport, and I also haven't spent much time in making sense of the resolvers.

Since I can't seem to make clones faster either, I'm not sure the benefits are currently there. I'm happy to speak to any of the work I've done if somebody is interested in picking it up, but I probably won't be following up on this anymore, sorry.

straight-shoota commented 1 year ago

Yeah, linking libgit2 statically isn't easy either. I think mostly because it depends on libssl. That's a nightmare to link statically.