Open NinjaCross opened 5 months ago
I've actually never thought about it, but if the contribution fits the spirit of the project, then why not. Before you put any effort into it, what is it you would like to change or add?
I tinkered a bit with the code locally to extend some of the existing functionality. In particular, I found myself working on big projects composed of multiple repos, with a lot of "noise" in the author names (many devs use different formats for their author names when working on different machines). The code modifications are a bit crude, but they work just fine. In particular, I added the following.
An `AuthorNameEqualityComparer`, passed to the `Dictionary<string, AuthorStatistics>`, to merge authors together using a more robust approach:
```cs
public class AuthorNameEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Equals(x?.Replace(" ", null).ToLower(), y?.Replace(" ", null).ToLower());
    }

    public int GetHashCode(string obj)
    {
        return obj?.Replace(" ", null).ToLower().GetHashCode() ?? 0;
    }
}
```
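To illustrate what the comparer does, here is a minimal, self-contained sketch. The `ComparerDemo` class, the `CountLines` helper, and the sample names are made up for illustration. The comparer is repeated so the snippet compiles on its own, and it uses `ToLowerInvariant` instead of `ToLower` so the result does not depend on the current culture (my substitution, not part of the change above).

```cs
using System;
using System.Collections.Generic;

public class AuthorNameEqualityComparer : IEqualityComparer<string>
{
    // Strip spaces and ignore case before comparing or hashing.
    private static string Normalize(string s) =>
        s?.Replace(" ", string.Empty).ToLowerInvariant();

    public bool Equals(string x, string y) =>
        string.Equals(Normalize(x), Normalize(y));

    public int GetHashCode(string obj) =>
        Normalize(obj)?.GetHashCode() ?? 0;
}

public static class ComparerDemo
{
    // Hypothetical helper: counts one "line" per author entry.
    public static Dictionary<string, int> CountLines(IEnumerable<string> authors)
    {
        var counts = new Dictionary<string, int>(new AuthorNameEqualityComparer());
        foreach (var author in authors)
        {
            // current stays 0 when the author has not been seen yet.
            counts.TryGetValue(author, out var current);
            counts[author] = current + 1;
        }
        return counts;
    }
}
```

Calling `ComparerDemo.CountLines(new[] { "John Doe", "johndoe", "JOHN DOE", "Jane" })` yields two entries, and looking up `"john doe"` returns 3, because all three spellings hash and compare as the same key.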
Implemented a `CleanAuthorName` method to infer the "real" author name based on simple decomposition rules and the most commonly used naming practices. This is just a mock, but it works very well and pushes the advantages provided by `AuthorNameEqualityComparer` even further:
```cs
public static string CleanAuthorName(string authorName)
{
    if (string.IsNullOrWhiteSpace(authorName)) return "ANONYMOUS";

    // Strip angle brackets, e.g. "<john.doe@example.com>".
    authorName = authorName.Replace("<", null).Replace(">", null).Trim();

    // Keep only the part before the "@" of an e-mail address.
    var indexOfAt = authorName.IndexOf("@", StringComparison.InvariantCultureIgnoreCase);
    if (indexOfAt > 0)
        authorName = authorName.Substring(0, indexOfAt);

    // Drop a domain prefix, e.g. "DOMAIN\jdoe".
    var indexOfSlash = authorName.IndexOf("\\", StringComparison.InvariantCultureIgnoreCase);
    if (indexOfSlash > 0)
    {
        var chunks = authorName.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries);
        authorName = chunks.LastOrDefault() ?? authorName;
    }

    // Remove common account suffixes.
    if (authorName.EndsWith("-ext", StringComparison.InvariantCultureIgnoreCase))
        authorName = authorName.Substring(0, authorName.Length - 4);
    if (authorName.EndsWith("-en", StringComparison.InvariantCultureIgnoreCase))
        authorName = authorName.Substring(0, authorName.Length - 3);
    if (authorName.EndsWith("-er", StringComparison.InvariantCultureIgnoreCase))
        authorName = authorName.Substring(0, authorName.Length - 3);

    // Treat "-" and "." as word separators and lower-case the result.
    authorName = authorName.Replace('-', ' ');
    authorName = string.Join(" ", authorName.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries).Select(a => a.Trim().ToLower()).Where(a => !string.IsNullOrWhiteSpace(a)));
    return authorName;
}
```
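A few sample inputs make the rules easier to follow. The sketch below is a condensed, self-contained variant of the method. The `AuthorNames` class name and the sample strings are invented for illustration, and it uses ordinal comparisons and `ToLowerInvariant` to keep the results independent of the current culture, which is an assumption on my side rather than what the code above does.

```cs
using System;
using System.Linq;

public static class AuthorNames
{
    public static string CleanAuthorName(string authorName)
    {
        if (string.IsNullOrWhiteSpace(authorName)) return "ANONYMOUS";
        authorName = authorName.Replace("<", string.Empty).Replace(">", string.Empty).Trim();

        // Keep only the local part of an e-mail address.
        var indexOfAt = authorName.IndexOf('@');
        if (indexOfAt > 0) authorName = authorName.Substring(0, indexOfAt);

        // Drop a leading DOMAIN\ prefix.
        if (authorName.IndexOf('\\') > 0)
        {
            var chunks = authorName.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries);
            authorName = chunks.LastOrDefault() ?? authorName;
        }

        // Remove account suffixes, checked in the same order as above.
        foreach (var suffix in new[] { "-ext", "-en", "-er" })
        {
            if (authorName.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
                authorName = authorName.Substring(0, authorName.Length - suffix.Length);
        }

        // Treat '-' and '.' as word separators and lower-case each part.
        authorName = authorName.Replace('-', ' ');
        var parts = authorName
            .Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(p => p.Trim().ToLowerInvariant())
            .Where(p => p.Length > 0);
        return string.Join(" ", parts);
    }
}
```

With this sketch, `"John.Doe@example.com"` becomes `"john doe"`, `@"DOMAIN\jdoe-ext"` becomes `"jdoe"`, `"<Jane-Roe>"` becomes `"jane roe"`, and a blank string becomes `"ANONYMOUS"`.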
This is used in `GitCommands.cs`, here:
```cs
var result = await ExecuteGitAsync(s => s?.StartsWith(authorMarker) ?? false, "blame", "--line-porcelain", _gitOptions.Branch, file).ConfigureAwait(false);
var authors = result
    .Select(x => CleanAuthorName(x.Substring(authorMarker.Length).Trim()))
    .ToList()
    .Aggregate(new Dictionary<string, int>(new AuthorNameEqualityComparer()), (dict, author) =>
```
Added support for multiple repositories in `RunAsync`: `GitDir` accepts a semicolon-separated list of repos, and compound results are reported at the end:

```cs
public async Task RunAsync(CommandLineOptions originalOptions)
{
    // GitDir may contain several repository paths separated by ';'.
    var gitDirs = originalOptions.GitDir.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
    var allReposAuthorStatistics = new List<ICollection<AuthorStatistics>>();
    for (var index = 0; index < gitDirs.Length; index++)
    {
        var gitDir = gitDirs[index];
        var options = new CommandLineOptions
        {
            AuthorsToMerge = originalOptions.AuthorsToMerge,
            GitDir = gitDir,
            Exclude = originalOptions.Exclude,
            Include = originalOptions.Include,
            // With several repos, write one file per repo, but only if an output was requested.
            Output = gitDirs.Length > 1
                ? (string.IsNullOrWhiteSpace(originalOptions.Output) ? string.Empty : $"{originalOptions.Output}_{index}.csv")
                : originalOptions.Output,
            Branch = originalOptions.Branch,
            ParallelBlameProcesses = originalOptions.ParallelBlameProcesses,
            VerboseOutput = originalOptions.VerboseOutput
        };
        var git = new GitCommands(options.GetGitOptions());
        var files = await git.GetFilesAsync().ConfigureAwait(false);
        var gitFileAnalyzer = new GitFileAnalyzer(options.GetFileAnalyzerOptions(), git);
        ICollection<AuthorStatistics> authorStatistics;
        using (var progressBar = CreateProgressBar(files.Count))
        {
            // ReSharper disable once AccessToDisposedClosure => false positive
            authorStatistics = await gitFileAnalyzer.BlameFilesAsync(files, progress => AdvanceProgressBar(progressBar, progress)).ConfigureAwait(false);
            var commitStatistics = await git.ShortlogAsync().ConfigureAwait(false);
            foreach (var commitStatistic in commitStatistics)
            {
                var authorStatistic = authorStatistics.SingleOrDefault(x => x.Author.Equals(commitStatistic.Key));
                if (authorStatistic == null)
                {
                    authorStatistic = new AuthorStatistics(commitStatistic.Key);
                    authorStatistics.Add(authorStatistic);
                }
                authorStatistic.CommitCount = commitStatistic.Value;
            }
            var merger = new AuthorsMerger(options.GetAuthorMergeOptions());
            authorStatistics = merger.Merge(authorStatistics);
        }
        WriteOutput(options, authorStatistics);
        DisplaySummary(authorStatistics);
        allReposAuthorStatistics.Add(authorStatistics);
    }

    if (allReposAuthorStatistics.Count > 1)
    {
        Console.WriteLine(Environment.NewLine + "Compound results:");
        // baseline using the first repo in the list
        var globalAuthorsStats = allReposAuthorStatistics.First();
        // ensure that the baseline contains ALL authors encountered in all repos
        foreach (var repo in allReposAuthorStatistics)
        {
            foreach (var author in repo)
            {
                if (globalAuthorsStats.All(gas => gas.Author != author.Author))
                {
                    globalAuthorsStats.Add(new AuthorStatistics(author.Author));
                }
            }
        }
        // remaining repos in the list
        var reposAuthorStats = allReposAuthorStatistics.Skip(1).ToArray();
        // foreach repo
        foreach (var currentRepo in reposAuthorStats)
        {
            // foreach author
            foreach (var authorStats in globalAuthorsStats.ToArray())
            {
                // stats of current user in current repo
                var foundSameAuthorInCurrentRepo = currentRepo.FirstOrDefault(ras => ras.Author == authorStats.Author);
                if (foundSameAuthorInCurrentRepo != null)
                {
                    var compoundAuthorStats = AuthorStatistics.MergeFrom(authorStats.Author, new[] { authorStats, foundSameAuthorInCurrentRepo });
                    globalAuthorsStats.Remove(authorStats);
                    globalAuthorsStats.Add(compoundAuthorStats);
                }
            }
        }
        WriteOutput(originalOptions, globalAuthorsStats);
        DisplaySummary(globalAuthorsStats);
    }
}
```
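The compound step above boils down to "group by author across all repos and add up the numbers". A minimal sketch of that idea with LINQ follows. `AuthorTotals` is a hypothetical stand-in for `AuthorStatistics` (whose real members, apart from `Author` and `CommitCount`, are not shown here; `LineCount` is an assumption), and `CompoundStats.Merge` is an invented helper, not something in the project.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AuthorStatistics; the real class has more members.
public sealed class AuthorTotals
{
    public AuthorTotals(string author, int lineCount, int commitCount)
    {
        Author = author;
        LineCount = lineCount;
        CommitCount = commitCount;
    }

    public string Author { get; }
    public int LineCount { get; }
    public int CommitCount { get; }
}

public static class CompoundStats
{
    // Flattens the per-repo results, groups them by author name and sums the counters.
    public static List<AuthorTotals> Merge(IEnumerable<IEnumerable<AuthorTotals>> repos)
    {
        return repos
            .SelectMany(repo => repo)
            // Ordinal matches the == comparison used in the loops above.
            .GroupBy(s => s.Author, StringComparer.Ordinal)
            .Select(g => new AuthorTotals(g.Key, g.Sum(s => s.LineCount), g.Sum(s => s.CommitCount)))
            .OrderByDescending(s => s.LineCount)
            .ToList();
    }
}
```

Passing an `AuthorNameEqualityComparer` instead of `StringComparer.Ordinal` to `GroupBy` would also merge differently formatted names across repos, at the cost of keeping whichever spelling is encountered first.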
As you can see, nothing special, but these modifications have been extremely useful to me. If you are interested, I could prepare a set of PRs for you to approve. Otherwise no problem, I'll keep them locally or fork the project for my personal use :)
@MisterGoodcat did you have a chance to evaluate my proposal?
I like this project very much, and I'd like to contribute. Would you be open to accepting PRs?