libgit2 / libgit2sharp

Git + .NET = ❤
http://libgit2.github.com
MIT License

OutOfMemory Error with Repository.Diff.Compare<Patch>(...) #1411

Open vinayaugustine opened 7 years ago

vinayaugustine commented 7 years ago

I get an out-of-memory LibGit2SharpException from Repository.Diff.Compare<Patch>() with v0.22.0:

LibGit2Sharp.LibGit2SharpException: Out of memory
   at LibGit2Sharp.Core.Ensure.HandleError(Int32 result) in c:\Git\libgit2sharp\LibGit2Sharp\Core\Ensure.cs:line 160
   at LibGit2Sharp.Core.Proxy.git_patch_from_diff(DiffSafeHandle diff, Int32 idx) in c:\Git\libgit2sharp\LibGit2Sharp\Core\Proxy.cs:line 1555
   at LibGit2Sharp.PatchStats..ctor(DiffSafeHandle diff) in c:\Git\libgit2sharp\LibGit2Sharp\PatchStats.cs:line 31
   at LibGit2Sharp.Diff.<.cctor>b__1d(DiffSafeHandle diff) in c:\Git\libgit2sharp\LibGit2Sharp\Diff.cs:line 106
   at LibGit2Sharp.Diff.BuildDiffResult[T](DiffSafeHandle diff) in c:\Git\libgit2sharp\LibGit2Sharp\Diff.cs:line 120
   at LibGit2Sharp.Diff.Compare[T](Tree oldTree, Tree newTree, IEnumerable`1 paths, ExplicitPathsOptions explicitPathsOptions, CompareOptions compareOptions) in c:\Git\libgit2sharp\LibGit2Sharp\Diff.cs:line 246
   at LibGit2Sharp.Diff.Compare[T](Tree oldTree, Tree newTree) in c:\Git\libgit2sharp\LibGit2Sharp\Diff.cs:line 157
   at ABB.QueryManager.Models.Git.GitCodeChurnToCsvConverter.<GetRowData>d__7.MoveNext() in C:\Workspace\Source\TeamMetrics\QueryManager\QueryManager\Models\Git\GitCodeChurnToCsvConverter.cs:line 52

I'm trying to produce some basic code churn statistics from a Git repository. For context, I'm running this in a Hangfire job inside an ASP.NET application.

For each branch, I'm iterating over all of the commits and comparing each one to its predecessor. I generate a patch per commit and emit one row of data for each file path changed in that commit. Here is the code I'm using:

foreach (var branch in repo.Branches.Where(b => b.Tip.Committer.When >= startDate))
{
    Commit nextCommit = null;
    foreach (var currentCommit in branch.Commits.Where(c => c.Committer.When >= startDate))
    {
        if(null != nextCommit) // skip the first commit seen: there is nothing to compare it with yet
        {
            foreach (var change in repo.Diff.Compare<Patch>(currentCommit.Tree, nextCommit.Tree))
            {
                var data = new string[]
                {
                    nextCommit.Id.Sha,
                    branch.FriendlyName,
                    change.Path,
                    change.Status.ToString(),
                    change.IsBinaryComparison.ToString(),
                    GetFileSize(nextCommit, change.Path).ToString(),
                    GetLineCount(nextCommit, change.Path).ToString(),
                    change.LinesAdded.ToString(),
                    change.LinesDeleted.ToString()
                };
                yield return data;
            }
        }
        nextCommit = currentCommit;
    }
}
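
A minimal sketch of the pairing step described above, separated from the diffing so it can be checked on its own. `Pairwise` is a hypothetical helper, not part of LibGit2Sharp, and the strings in `Main` merely stand in for commits; the sketch assumes C# 7 tuple support and that the sequence is enumerated newest first, as the loop above appears to rely on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairwiseExtensions
{
    // Hypothetical helper: yields each adjacent pair (Previous, Current)
    // of the source sequence, in enumeration order.
    public static IEnumerable<(T Previous, T Current)> Pairwise<T>(this IEnumerable<T> source)
    {
        using (var e = source.GetEnumerator())
        {
            // An empty sequence has no pairs.
            if (!e.MoveNext()) yield break;

            var previous = e.Current;
            while (e.MoveNext())
            {
                yield return (previous, e.Current);
                previous = e.Current;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for a newest-first commit list (assumed ordering).
        var commits = new[] { "c3", "c2", "c1" };

        // With newest-first input, Previous is the newer item and Current the older one.
        foreach (var (newer, older) in commits.Pairwise())
        {
            Console.WriteLine($"{older} -> {newer}");
        }
    }
}
```

With a helper like this, the inner loop could become `foreach (var (newer, older) in branch.Commits.Pairwise())` followed by `repo.Diff.Compare<Patch>(older.Tree, newer.Tree)`, which removes the need for the null check on `nextCommit`.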

I found this related Stack Overflow question: libgit2sharp.Patch outofmemory. I tried changing my code to produce both a TreeChanges and a PatchStats, but it fails as well.

carlosmn commented 7 years ago

Have you tried with 0.22.1 or the prereleases for 0.23? We had a bug in libgit2 where we'd allocate silly amounts of memory for trees on Windows, which I believe affected the native binaries shipped with 0.22.0.

vinayaugustine commented 7 years ago

I upgraded to 0.23.1. I still get an OutOfMemory error when I use Diff.Compare<Patch>(). However, I then switched to using Diff.Compare<TreeChanges>() and Diff.Compare<PatchStats>(), and that has been running for ~5 hours with no error. Is there any reason Patch still fails?

Here's the updated version:

foreach (var currentCommit in Repository.Configuration.FilterCommits(branch.Commits))
{
    if(null != nextCommit)
    {
        var changes = repo.Diff.Compare<TreeChanges>(currentCommit.Tree, nextCommit.Tree);
        var stats = repo.Diff.Compare<PatchStats>(currentCommit.Tree, nextCommit.Tree);
        foreach (var change in changes)
        {
            var data = new string[]
            {
                nextCommit.Id.Sha,
                branch.FriendlyName,
                change.Path,
                change.Status.ToString(),
                "false", // TODO remove: only available with Patch
                GetFileSize(nextCommit, change.Path).ToString(),
                GetLineCount(nextCommit, change.Path).ToString(),
                stats[change.Path].LinesAdded.ToString(),
                stats[change.Path].LinesDeleted.ToString()
            };
            yield return data;
        }
    }
    nextCommit = currentCommit;
}

JasonDevStudio commented 3 years ago

@vinayaugustine Hello, I am also looking for a way to count the number of lines of code for each person through LibGit2Sharp. Have you already solved it? Can you tell me how to do it?