libgit2 / libgit2sharp

Git + .NET = ❀
http://libgit2.github.com
MIT License
3.12k stars 878 forks source link

Idea: Separate Low Level LibGit2 Wrapper and Higher Level Convenience Classes #1998

Open Danielku15 opened 1 year ago

Danielku15 commented 1 year ago

Looking at the pace of this repository it seems due to the lack of contributors it almost appears that there is not much happening on this library and it is not really in a production ready state.

Git/LibGit2 are advancing fast and obviously it is hard to keep up the pace providing a full .net flavor on top of that.

Hence I was thinking of the following: What if there would be two packages around libgit2sharp:

  1. libgit2sharp.Internals, libgitsharp2.Core or reuse the existing libgit2sharp.NativeLibs: which is a very low level wrapper around the C-API of libgit2. This whole wrapper would be auto-generated using the structs and function definitions of the native lib. It just allows you to inconviniently use the C-API in the .net world. This library should be almost instantly up-to-date with the native libgit2 due to its automatic build and code generation nature. It allows everyone who is advanced enough to use low level APIs of libgit2. Also if APIs are not yet exposed on the higher level library (point 2) you might be able to call some low level functions if you know how to handle it. In short, it is likely everything in https://github.com/libgit2/libgit2sharp/tree/master/LibGit2Sharp/Core but publicly exposed for users who are capable of using such low levels as autogenerated code + some manual helpers maybe to make marshalling easier. We could use https://github.com/mono/CppSharp for this. It is well maintained and used.
  2. libgit2sharp which provides the convenience APIs on top of the point 1 library which has proper native memory management. Nice APIs to interact with everything which needs time to design and implement manually. This library will likely be a bit slower to be up-to-date with the native lib because it needs manual development and organization.

We are concerned to use libgit2sharp in new projects due to its nature of not being maintained and uptodate. This implies risk of not getting bugfixes (feature and security wise) or new functionalities which are added to libgit2 in fast pace. I looked at how most .net tools currently interact with git and they seem to mostly rely on calling the git.exe and parse the outputs (I checked Visual Studio, Fork, GitHub Desktop and some other tools to evaluate how they do it) so we are left to either develop again an own framework for us, or rely on libgit2sharp which is struggling.

Let me know what you think of such an idea?

ethomson commented 1 year ago

The structure that you've proposed is roughly what I've tackled in Dogged.

I played a bit with using codegen to build the the PInvoke layer, but I haven't had a lot of success yet. But I think that it's definitely the right approach, it's just something that I haven't really "nailed" yet so to speak.

ethomson commented 1 year ago

(I think that https://github.com/xoofx/GitLib.NET may be a good starting point.)

Danielku15 commented 1 year ago

The Dogged repo looks quite interesting. Would be good to hear a bit more about your long term plans around this lib and the relation to libgit2sharp. Will likely give it a shot to check what is there already and look if I could contribute some code generation aspects. If this project gets traction would be worth listing it on https://github.com/libgit2/libgit2#language-bindings (maybe for discussion over at the Dogged Repo)

GitLib.NET might be a good reference to see how they solved some aspects, but it is also rather a dead project 😞

astrohart commented 1 year ago
Attempting to gather dependency information for package 'LibGit2Sharp.NativeBinaries.2.0.315' with respect to project 'MassCloner', targeting '.NETFramework,Version=v4.8'
Gathering dependency information took 28 ms
Attempting to resolve dependencies for package 'LibGit2Sharp.NativeBinaries.2.0.315' with DependencyBehavior 'Lowest'
Unable to resolve dependencies. 'LibGit2Sharp.NativeBinaries 2.0.315' is not compatible with 'LibGit2Sharp 0.27.0-preview-0182 constraint: LibGit2Sharp.NativeBinaries (= 2.0.315-alpha.0.9)'.

I echo the sentiments in this issue. The process of updating the LibGit2Sharp NuGet Package is really cumbersome because there is a philosophy of breaking the update concordance between this package and the LibGit2Sharp.NativeBinaries package -- causing this message to appear and breaking the functionality of the Update button in the NuGet Package Manager window in Visual Studio.

Really wish this would get addressed, although, I am skeptical because I recall being told that there are no plans to address this. It is a severe usability issue.

ethomson commented 1 year ago

Really wish this would get addressed, although, I am skeptical because I recall being told that there are no plans to address this. It is a severe usability issue.

You're saying, in effect, why shouldn't you be able to update to the latest version of the native binaries package independent of the version of LibGit2Sharp that you use?

This seems like a mismatch in expectations...

Because the native binaries is the packaging for the native libraries, and LibGit2Sharp contains the pinvoke layer. Trying to use mismatched versions would crash your app hard.

Isn't this often true for transitive dependencies, though? There's no guarantee that upgrading a thing that a thing that you use depends on will work? It's just more true for native dependencies where you crash hard instead of throwing.

Does nuget have a version compatibility expression more like npm's? Is LibGit2Sharp not using it? If so, that should be able to resolve it.

But maybe I don't understand - what's the actual use case for having mismatched versions?

Danielku15 commented 1 year ago

I agree to @ethomson that with the current situation this might be less a libgit2sharp problem than more a IDE problem. It it shouldn't offer an upgrade button in cases where dependency graphs cannot be fullfilled with an upgrade. In this case: For LibGit2Sharp.NativeBinaries VS/Rider should not provide an upgrade button if a LibGit2Sharp requires a certain lower version. But unfortunately this dependency is often only checked as part of the upgrade process itself leading to the mentioned error.

On the other hand LibGit2Sharp with all NuGets could adopt at least a bit the versioning/deployment strategy to consider following aspects:

  1. Only release LibGit2Sharp.NativeBinaries as pre-release packages before a matching LibGit2Sharp main package is available. It should be rather obvious that I cannot upgrade some transitive dependencies to future pre-release versions.
  2. Unfortunately libgit2 and the wrappers might not be practicing strict SemVer which would allow upgrading to certain minor versions because compatibility should be guaranteed.
  3. Going forward on separating the Pinvoke layer and the API layer it might actually become more realistic to upgrade the Pinvoke layer while the API layer stays "old". The Pinvoke ensure compatibility with the native layer so memory layouts of structs and call signatures are correct. DotNet is a lot more forgiving in terms of non-breaking changes like new options (assembly redirects) which would allow a certain level of compatibility.

So somehow I agree with both of you 😁

benblo commented 1 year ago

I agree with the "split low-/high-level" sentiment; libgit2 itself is made for low-level, but some parts of LibGit2Sharp operate at a much higher level, without exposing the low-level.

Eg I've recently wanted to use git_graph_descendant_of, which I couldn't find; I had to go through the source code to find that repo.Refs.ReachableFrom was based around it. So I can basically find if A is a descendant of B, if I reverse the logic and go through hoops wrapping stuff in IEnumerable Refs and Commits, instead of good old ids... but, I'd rather DescendantOf was exposed directly. The Proxy layer should be available somehow. The Native layer seem to be about mostly about marshalling so it seems good that it's kept separate, public or not.

Also, some things are slower than they should be, eg QueryBy(path) (FileHistory internally) is so slow as to be unusable for my usecase, while a git log on the same repo/path is basically instant. That's fishy.

A glance at the code feels like the high-level is relying a bit too much on managed stuff, eg:

... at high bandwidth, it's hard to imagine these things don't have an impact. Note that I'm a gamedev by day, I could be paranoid for being too used to think in milliseconds; I've not benchmarked anything (wouldn't really know how), maybe something like git is so IO-bound that memory bandwidth doesn't matter.

jairbubbles commented 9 months ago

@Danielku15 Sounds indeed like a good idea! Personally I would make it leave outside the scope of this repo as a fresh new start while keeping in mind that LibGit2Sharp could at the end switch to it.

That's sad that a common initiative doesn't emerge. It's a pattern that I often see in the .NET ecosystem, new projects often come as personal initiatives and they're hard to sustain. Wouldn't it be great if @xoofx and @ethomson would collaborate on the same project? 🀩

xoofx commented 9 months ago

GitLib.NET might be a good reference to see how they solved some aspects, but it is also rather a dead project 😞 Wouldn't it be great if @xoofx and @ethomson would collaborate on the same project? 🀩

Let me give my perspective on this topic πŸ™‚

jairbubbles commented 9 months ago

Thanks @xoofx for the details!

Even if code generation would be great, I don't think it's mandatory. I would just help maintenance. Also, code generation could be used as a first draft and then it would be polished at hand. Finally, it doesn't tackle the fact that we need tests (cf https://github.com/xoofx/GitLib.NET/issues/2).

When you look at PRs here, a lot of them is about missing API from libigit2, people just add them (along with some tests) it's pretty straight forward. The main issue is that at the end they don't get merged (for good or bad reasons).

xoofx commented 9 months ago

Even if code generation would be great, I don't think it's mandatory. I would just help maintenance. Also, code generation could be used as a first draft and then it would be polished at hand. Finally, it doesn't tackle the fact that we need tests (cf https://github.com/xoofx/GitLib.NET/issues/2).

Agreed, it's not mandatory, and yes, maintenance wise is a major benefit of doing it automatically, but not only: if you really want to have consistency across the board, API documentation at least replicated, tracking changes new api/obsoleted API...etc.

But anyway, the main problem is less about manual vs automatic but about your 2nd point (which was my last point 😁)

When you look at PRs here, a lot of them is about missing API from libigit2, people just add them (along with some tests) it's pretty straight forward. The main issue is that at the end they don't get merged (for good or bad reasons).

Firstly, I don't think that LibGit2Sharp is a good place to introduce this, as it has too much legacy...

but more importantly, to have a low-level managed wrapper of libgit2, you need to have at least one folk (of course more is always better) that definitely needs this for a living (e.g strong dependence in a company) or for a super important personal project and would be ready to spend a few months to provide this. πŸ™‚

jairbubbles commented 9 months ago

Firstly, I don't think that LibGit2Sharp is a good place to introduce this, as it has too much legacy...

What about Dogged? I've personally discovered it yesterday. Looks like a big step forward compared to LibGit2Sharp.

xoofx commented 9 months ago

What about Dogged? I've personally discovered it yesterday. Looks like a big step forward compared to LibGit2Sharp.

I usually only care about the low level part so that would be more Dogged.Native. Most abstractions are going to cripple performance with managed allocations all around. It takes also more time to update such library, because not only you have to provide the low level part, but you need to think about how to shape the abstraction. It is a lot more work.

Also the low level part is using raw pointers while I personally prefer to have struct handles (as I did in GitLib.NET, if a struct is used only through a pointer and doesn't have fields, then you can turn that struct itself in an abstract pointer - a struct wrapping a pointer, and you get rid of pointers in all APIs).

Also, you can see that it is very laborious to get a proper coverage of the native API. Dogged here covers a fraction of the native API, and when you bring such API, you need to bring also the abstraction (As we can see with opened PRs)

That's why, maintenance wise, doing that manually is more a problem, as you have to go through everything single details like that and make PR for each missing API and in the end, it requires more work in the long run and context switching (also for the maintainer). In OSS, if a project requires very frequent gardening to get basic stuffs, that becomes quickly problematic.

xoofx commented 1 month ago

As I was working again on my codegen tooling from C++ for other libraries, I gave libgit2 a 2nd try: https://github.com/XenoAtom/XenoAtom.Interop/tree/main/src/libgit2 ☺️