MisterGoodcat / GitFameSharp

MIT License
2 stars, 1 fork

How to contribute? #1

Open NinjaCross opened 5 months ago

NinjaCross commented 5 months ago

I like this project very much, and I'd like to contribute. Would you be open to accepting PRs?

MisterGoodcat commented 5 months ago

I've actually never thought about it, but if the contribution fits the spirit of the project, then why not. Before you put any effort into it, what is it you would like to change or add?

NinjaCross commented 5 months ago

I tinkered with the code locally to extend some of the existing functionality. I work on large projects composed of multiple repos, with a lot of "noise" in the author names (many devs use different formats for their author name on different machines). The modifications are a bit crude, but they work just fine. In particular, I added the following:

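  1. An `AuthorNameEqualityComparer`, passed to the `Dictionary<string, AuthorStatistics>`, to merge authors using a more robust comparison that ignores spaces and letter case:

    public class AuthorNameEqualityComparer : IEqualityComparer<string>
    {
        // Two names are equal when they match after removing spaces and lowercasing.
        public bool Equals(string x, string y)
        {
            return string.Equals(Normalize(x), Normalize(y));
        }

        // Must hash the normalized form so that equal names share a hash code.
        public int GetHashCode(string obj)
        {
            return Normalize(obj)?.GetHashCode() ?? 0;
        }

        private static string Normalize(string name)
        {
            return name?.Replace(" ", string.Empty).ToLowerInvariant();
        }
    }
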
  2. A `CleanAuthorName` method that infers the "real" author name using simple decomposition rules based on the most common naming practices. It is only a rough prototype, but it works very well and pushes the benefits of `AuthorNameEqualityComparer` even further:

    // Requires: using System; using System.Linq;
    public static string CleanAuthorName(string authorName)
    {
        if (string.IsNullOrWhiteSpace(authorName)) return "ANONYMOUS";

        // Strip the angle brackets that usually wrap an e-mail address.
        authorName = authorName.Replace("<", string.Empty).Replace(">", string.Empty).Trim();

        // Keep only the part before the '@' of an e-mail address.
        var indexOfAt = authorName.IndexOf('@');
        if (indexOfAt > 0)
            authorName = authorName.Substring(0, indexOfAt);

        // Drop a domain prefix such as DOMAIN\user.
        var indexOfSlash = authorName.IndexOf('\\');
        if (indexOfSlash > 0)
        {
            var chunks = authorName.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries);
            authorName = chunks.LastOrDefault() ?? authorName;
        }

        // Remove the account suffixes I ran into ("-ext", "-en", "-er").
        if (authorName.EndsWith("-ext", StringComparison.InvariantCultureIgnoreCase))
            authorName = authorName.Substring(0, authorName.Length - 4);

        if (authorName.EndsWith("-en", StringComparison.InvariantCultureIgnoreCase))
            authorName = authorName.Substring(0, authorName.Length - 3);

        if (authorName.EndsWith("-er", StringComparison.InvariantCultureIgnoreCase))
            authorName = authorName.Substring(0, authorName.Length - 3);

        authorName = authorName.Replace('-', ' ');

        // Turn "first.last" into "first last", lowercased.
        authorName = string.Join(" ", authorName.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries).Select(a => a.Trim().ToLower()).Where(a => !string.IsNullOrWhiteSpace(a)));

        return authorName;
    }

This is used in `GitCommands.cs`, here:
```cs
var result = await ExecuteGitAsync(s => s?.StartsWith(authorMarker) ?? false, "blame", "--line-porcelain", _gitOptions.Branch, file).ConfigureAwait(false);
var authors = result
    .Select(x => CleanAuthorName(x.Substring(authorMarker.Length).Trim()))
    .ToList()
    .Aggregate(new Dictionary<string, int>(new AuthorNameEqualityComparer()), (dict, author) =>
```

  3. The possibility to specify multiple Git folders (separated by `;`) in order to obtain a "compound report". Something like this:

    public async Task RunAsync(CommandLineOptions originalOptions)
    {
        var gitDirs = originalOptions.GitDir.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        var allReposAuthorStatistics = new List<ICollection<AuthorStatistics>>();
        for (var index = 0; index < gitDirs.Length; index++)
        {
            var gitDir = gitDirs[index];
            var options = new CommandLineOptions
            {
                AuthorsToMerge = originalOptions.AuthorsToMerge,
                GitDir = gitDir,
                Exclude = originalOptions.Exclude,
                Include = originalOptions.Include,
                // With several repos, write one numbered file per repo, but only if an output file was requested.
                Output = gitDirs.Length > 1
                    ? (string.IsNullOrWhiteSpace(originalOptions.Output) ? string.Empty : $"{originalOptions.Output}_{index}.csv")
                    : originalOptions.Output,
                Branch = originalOptions.Branch,
                ParallelBlameProcesses = originalOptions.ParallelBlameProcesses,
                VerboseOutput = originalOptions.VerboseOutput
            };

            var git = new GitCommands(options.GetGitOptions());
            var files = await git.GetFilesAsync().ConfigureAwait(false);

            var gitFileAnalyzer = new GitFileAnalyzer(options.GetFileAnalyzerOptions(), git);

            ICollection<AuthorStatistics> authorStatistics;

            using (var progressBar = CreateProgressBar(files.Count))
            {
                // ReSharper disable once AccessToDisposedClosure => false positive
                authorStatistics = await gitFileAnalyzer.BlameFilesAsync(files, progress => AdvanceProgressBar(progressBar, progress)).ConfigureAwait(false);

                var commitStatistics = await git.ShortlogAsync().ConfigureAwait(false);
                foreach (var commitStatistic in commitStatistics)
                {
                    var authorStatistic = authorStatistics.SingleOrDefault(x => x.Author.Equals(commitStatistic.Key));
                    if (authorStatistic == null)
                    {
                        authorStatistic = new AuthorStatistics(commitStatistic.Key);
                        authorStatistics.Add(authorStatistic);
                    }

                    authorStatistic.CommitCount = commitStatistic.Value;
                }

                var merger = new AuthorsMerger(options.GetAuthorMergeOptions());
                authorStatistics = merger.Merge(authorStatistics);
            }

            WriteOutput(options, authorStatistics);
            DisplaySummary(authorStatistics);
            allReposAuthorStatistics.Add(authorStatistics);
        }

        if (allReposAuthorStatistics.Count > 1)
        {
            Console.WriteLine(Environment.NewLine + "Compound results:");
            // baseline using the first repo in the list
            var globalAuthorsStats = allReposAuthorStatistics.First();

            // ensure that the baseline contains ALL authors encountered in all repos
            foreach (var repo in allReposAuthorStatistics)
            {
                foreach (var author in repo)
                {
                    if (globalAuthorsStats.All(gas => gas.Author != author.Author))
                    {
                        globalAuthorsStats.Add(new AuthorStatistics(author.Author));
                    }
                }
            }

            // remaining repos in the list
            var reposAuthorStats = allReposAuthorStatistics.Skip(1).ToArray();
            // foreach repo
            foreach (var currentRepo in reposAuthorStats)
            {
                // foreach author (iterate over a copy, since the collection is modified below)
                foreach (var authorStats in globalAuthorsStats.ToArray())
                {
                    // stats of current user in current repo
                    var foundSameAuthorInCurrentRepo = currentRepo.FirstOrDefault(ras => ras.Author == authorStats.Author);
                    if (foundSameAuthorInCurrentRepo != null)
                    {
                        var compoundAuthorStats = AuthorStatistics.MergeFrom(authorStats.Author, new[] { authorStats, foundSameAuthorInCurrentRepo });
                        globalAuthorsStats.Remove(authorStats);
                        globalAuthorsStats.Add(compoundAuthorStats);
                    }
                }
            }

            WriteOutput(originalOptions, globalAuthorsStats);
            DisplaySummary(globalAuthorsStats);
        }
    }

  4. An upgrade to .NET 8.0
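
For illustration only, here is a minimal self-contained sketch of how a space- and case-insensitive comparer like the one in the first item could plug into a dictionary aggregation that counts blamed lines per author. `AuthorMergeDemo`, `NameComparer` and `CountLines` are hypothetical names used for this example, not part of GitFameSharp, and the real aggregation in `GitCommands.cs` may well differ:

```cs
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuthorMergeDemo
{
    // Hypothetical stand-in for AuthorNameEqualityComparer: ignores spaces and letter case.
    private sealed class NameComparer : IEqualityComparer<string>
    {
        private static string Normalize(string s) =>
            s?.Replace(" ", string.Empty).ToLowerInvariant();

        public bool Equals(string x, string y) =>
            string.Equals(Normalize(x), Normalize(y), StringComparison.Ordinal);

        public int GetHashCode(string obj) => Normalize(obj)?.GetHashCode() ?? 0;
    }

    // Hypothetical helper: one input entry per blamed line, returns the line count per author.
    public static Dictionary<string, int> CountLines(IEnumerable<string> authors)
    {
        return authors.Aggregate(
            new Dictionary<string, int>(new NameComparer()),
            (dict, author) =>
            {
                // TryGetValue leaves count at 0 when the author is not yet known.
                dict.TryGetValue(author, out var count);
                dict[author] = count + 1;
                return dict;
            });
    }
}
```

Calling `AuthorMergeDemo.CountLines(new[] { "John Doe", "john doe", "JohnDoe", "Jane Roe" })` yields two entries: the three spellings of John Doe collapse into a single key with a count of 3, and Jane Roe keeps her own entry with a count of 1.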

As you can see, it's nothing special, but these modifications have been extremely useful to me. If you are interested, I could prepare a set of PRs for you to approve. Otherwise no problem, I'll keep them locally or fork the project for my personal use :)

NinjaCross commented 3 months ago

@MisterGoodcat did you have a chance to evaluate my proposal?