libgit2 / libgit2sharp

Git + .NET = ❤
http://libgit2.github.com
MIT License
3.2k stars 889 forks source link

Feature: Offer CoreCLR-compatible portable library #1293

Closed AArnott closed 7 years ago

AArnott commented 8 years ago

As discussed in #874:

I'm willing to take a look at converting libgit2sharp to a portable library that can run on CoreCLR. My exploration is in the portable branch of my fork.

I'd like an active issue to track this exploration where I can share ideas for making this work with the project owners, and verify owners' willingness to ultimately accept a PR when the work is done.

Work ongoing in PR #1318

Issue Proposal/Resolution
Maintain net40 compatibility Maintain the net40 targeted class library. Move source code to a Shared Project and add a Portable library project to add support for net45 genre portable library
Serializable exceptions Put #if blocks around desktop-only serializable support
ICustomMarshaler CoreCLR does not support this attribute. Replace all uses with thunking method wrappers. Keep tax and extra code very low by leveraging code gen during build, the way the PInvoke project does.
ReliabilityContract Put #if blocks around these
X509Certificate Offered via a corefx nuget package
SecureString Offered via corefx NuGet package
BufferedStream Cheap portable imitation
CriticalFinalizerObject Put #if around base class declaration
Environment.OSVersion #if Desktop keep, #else use CoreCLR RuntimeInformation
Trace Put #ifblocks around these

Code generation design

The p/invoke signature that is hand-written changes from this:

[DllImport(libgit2)]
internal static extern unsafe int git_blob_create_fromchunks(
    ref GitOid oid,
    git_repository* repositoryPtr,
    [MarshalAs(UnmanagedType.CustomMarshaler, MarshalCookie = UniqueId.UniqueIdentifier, MarshalTypeRef = typeof(StrictFilePathMarshaler))] FilePath hintpath,
    source_callback fileCallback,
    IntPtr data);

To this:

[DllImport(libgit2)]
internal static extern unsafe int git_blob_create_fromchunks(
    ref GitOid oid,
    git_repository* repositoryPtr,
    [CustomMarshaler(typeof(StrictFilePathMarshaler))] byte* hintpath,
    source_callback fileCallback,
    IntPtr data);

Notice the change from using the MarshalAs attribute to using CustomMarshaler. This attribute causes another overload of this method to be automatically generated during the build (and available to Intellisense):

internal static unsafe int git_blob_create_fromchunks(ref GitOid oid, git_repository* repositoryPtr, string hintpath, source_callback fileCallback, IntPtr data)
{
    ICustomMarshaler marshaler = new StrictFilePathMarshaler();
    IntPtr p_hintpath = marshaler.MarshalManagedToNative(hintpath);
    try
    {
        return git_blob_create_fromchunks(ref oid, repositoryPtr, (byte*)p_hintpath, fileCallback, data);
    }
    finally
    {
        marshaler.CleanUpNativeData(p_hintpath);
    }
}

Notice this generated overload uses string as the marshaled parameter type, keeping the ease of use that we had before. It then emulates what the desktop marshaler would have done (at least, as far as I understand it).

ethomson commented 8 years ago

and verify owners' willingness to ultimately accept a PR when the work is done.

I'd be happy to see CoreCLR compatibility, and we'd accept a PR as long as it doesn't reduce compatibility elsewhere (which I don't think that it would).

(Probably makes sense to discuss those details in a PR.)

AArnott commented 8 years ago

Great. To keep this issue (which could get long) readable, I'll keep a running list of issues and their resolutions in the initial description. Then we can comment and discuss below, and I'll update with decisions in the description.

AArnott commented 8 years ago

Can someone please review the custom marshaler solution I've described in the description above?

ethomson commented 8 years ago

I'll keep a running list of issues and their resolutions in the initial description

Project owners, subscribers, etc don't get notifications when you edit the initial description - so while that's fine to keep a summary, please do also follow up with a comment. :)

ethomson commented 8 years ago

Notice the change from using the MarshalAs attribute to using CustomMarshaler. This attribute causes another overload of this method to be automatically generated during the build (and available to Intellisense)

What if a method takes 5 strings? Does it make two overloads (one with all byte* and a second with all string)? Or does it multiply out of control?

What actually generates the code? Can you point me to a similar example? Is it an MSBuild task or something?

AArnott commented 8 years ago

please do also follow up with a comment. :)

Yes, I'll do that

What if a method takes 5 strings? Does it make two overloads (one with all byte* and a second with all string)? Or does it multiply out of control?

We have total control over it. The goal is just one overload is generated (making 2 in total).

What actually generates the code? Can you point me to a similar example? Is it an MSBuild task or something?

It's a Roslyn-based code generation framework, with custom generation code specifically for libgit2sharp added to it. The PInvoke project is an example of a shipping project that uses this technique in order to define very low level overloads (using char* or byte*) and get friendly overloads generated for the more common use cases. For PInvoke, we do want multiple overloads generated in some cases. In libgit2sharp's case, we'll want just one (reproducing the original p/invoke before I modify it).

I've completed the code generator for libgit2sharp and will push very soon so you can actually see it working on your own project.

AArnott commented 8 years ago

I've pushed the code generator and a significant set of altered p/invoke signatures to my branch.

AArnott commented 8 years ago

I'm trying to understand how this CustomMarshaler works for out parameters and return values. Who frees the native memory? It seems the custom marshaler does, but that seems to violate the best practice of always using the same library to free the memory as was used to allocate it. Yet I don't see any code to call back into the library to free this memory.

Also, I did some reading on use of custom marshalers and the MSDN docs seem to fixate on them being used to marshal COM objects rather than other values for P/Invoke signatures. Is this even a proper use?

AArnott commented 8 years ago

When this change is complete, should I expect it to merge into master or vNext?

ethomson commented 8 years ago

I'm trying to understand how this CustomMarshaler works for out parameters and return values. Who frees the native memory?

libgit2 does. If libgit2 is allocating the memory, it will also free it. If you give me an example of a function that you're looking at, then I can provide more details about the lifecycle, but (for example) calling git_diff_free will free the filenames that were returned in the diff deltas and that were marshalled.

Is this even a proper use?

By what metric are you defining "proper use"?

When this change is complete, should I expect it to merge into master or vNext?

master is now our main branch.

AArnott commented 8 years ago

If you give me an example of a function that you're looking at, then I can provide more details about the lifecycle

I was looking at git_repository_path, for example. It returns a pointer to a buffer that you marshal to a FilePath. But how is the memory it points to freed? I don't see a free function. Is it pointing to memory that is allocated with the repository and thus guaranteed to live exactly as long as the repo, and we're just copying the memory into a managed string during that time?

By what metric are you defining "proper use"?

Well, between MSDN only talking about it in the context of COM, and not supporting it on CoreCLR which has no support for COM but does support p/invokes, it seems like this attribute is being used where it was not designed to, and thus leaves open the question in my mind as to whether it's possible for it to marshal memory properly. For example, if it only marshals COM objects, then freeing memory isn't an issue since COM objects release themselves based on reference counting. But if you're marshaling other values as you're doing here, then the attribute has to somehow know how to call back into the native library to free the memory, yet as a marshaling attribute I don't see how it can do so.

ethomson commented 8 years ago

Is it pointing to memory that is allocated with the repository and thus guaranteed to live exactly as long as the repo, and we're just copying the memory into a managed string during that time?

Yes, exactly so.

But if you're marshaling other values as you're doing here, then the attribute has to somehow know how to call back into the native library to free the memory, yet as a marshaling attribute I don't see how it can do so.

It doesn't need to. We make explicit calls to our various free functions, like git_repository_free, git_diff_free, etc.

At the moment, we typically do not hold on to the native objects, and so the pattern is to create a managed diff object, copy all the data in and then release the native object immediately. So we very quickly free the associated native memory. This is an approach that we're revisiting, but that's a detail that really doesn't matter to the marshalling.

Anyway, if you're asking for git_repository_workdir, then you get a pointer that is freed in git_repository_free. You don't get a copy of that string that you need to work with specially.

AArnott commented 8 years ago

Thanks. I think that's all I need at the moment. Have you had a chance to look at my branch, open it up in VS, and get a feel for how it works (with code gen, etc)?

AArnott commented 8 years ago

I've got the portable project building. No more compile errors. I haven't run tests yet.

Before I spend much more time on this, I'd like a pre-review of the changes as a course check so I'm not wasting time going in an unfruitful direction. Can a committer please sign up to take a look? I can file a PR for purposes of review and discussion if that's helpful.

ethomson commented 8 years ago

Sure, @AArnott , I'll try to take a look tomorrow evening.

AArnott commented 8 years ago

:eyes:

AArnott commented 8 years ago

Ping :)

AArnott commented 8 years ago

I've got all the tests passing for the desktop and portable libraries! (modulo the 3 that are already failing in master).

ethomson commented 8 years ago

Thanks for staying on top of this @AArnott . My Windows VM apparently konked out and I had to reinstall this weekend. (Fast ring of death?) In any case, I'm going to try to make some time for this again this weekend.

Side note: you have three tests failing in master? That's unexpected. Presumably some local configuration on your machine that we're failing to sandbox for correctly. Would you be so kind as to open an issue?

AArnott commented 8 years ago

Bugs filed for failing tests: #1312 #1313 #1314

Thanks for looking.

pawchen commented 8 years ago

👍 , commenting to get notification.

AArnott commented 8 years ago

@diryboy you can just click the "Subscribe" button on the right.

pawchen commented 8 years ago

@AArnott Thanks but that button would show 'Unsubscribe' if you already watched the repository, so I just don't trust it.

ah- commented 8 years ago

As part of this it would be nice to also distribute the native binaries in a new-style package, I created an issue here: https://github.com/libgit2/libgit2sharp.nativebinaries/issues/39

AArnott commented 8 years ago

@ah- that might be nice, but I wouldn't want to do it "as part of this" lest we make this already big project significantly bigger. Also, libgit2sharp currently supports older versions of VS (I believe) that don't support the new package layout. So adopting the new one would mean cutting off some users.

ah- commented 8 years ago

Yes, makes sense to do that bit later.

The new package layout is fully compatible without cutting off any users of old tooling if the build.props for them are adjusted to the new paths. But I see that this will require some actual testing, which can be done after this has been merged.

Jaykul commented 8 years ago

What ever happened to this? With PowerShell going cross-platform, I was looking for .NetStandard version of the library ;-)

shiftkey commented 8 years ago

@Jaykul it's in flight, but it's been a tough review to slog through so it's kind of stalled: https://github.com/libgit2/libgit2sharp/pull/1318

pdemro commented 8 years ago

Looking forward to this PR. Thank you for your efforts guys.

Trolldemorted commented 7 years ago

does this issue cover support for the W10 UWP, or should i open a new one?

Alxandr commented 7 years ago

Any updates on this?

bording commented 7 years ago

It's almost ready: https://github.com/libgit2/libgit2sharp/pull/1318

@ethomson wants to release another version first before we merge it in.