My initial idea for getting the line count of a SourceFile record was to sum (additions - deletions) over all SourceFileChange records of the file.
This is an abbreviated version of the git log command we use to index commit and file-change data:
git log --numstat --summary
If you sum the file change objects under a source file, the result doesn't match the actual line count of the file.
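For reference, the naive sum described above can be sketched roughly like this in Python. The function name is hypothetical, and the sketch assumes the standard tab-separated `--numstat` lines (`added<TAB>deleted<TAB>path`); it skips binary entries and does not try to follow renames.

```python
import re
from collections import defaultdict

# A numstat line looks like "12\t3\tpath/to/file"; binary files show "-\t-\tpath".
NUMSTAT_RE = re.compile(r"^(\d+|-)\t(\d+|-)\t(.+)$")

def naive_line_counts(log_output):
    """Sum (additions - deletions) per path over every numstat line in the log."""
    totals = defaultdict(int)
    for line in log_output.splitlines():
        match = NUMSTAT_RE.match(line)
        if match is None:
            continue  # commit headers, messages, summary lines, blank lines
        added, deleted, path = match.groups()
        if added == "-" or deleted == "-":
            continue  # binary file: no line counts available
        totals[path] += int(added) - int(deleted)
    return dict(totals)
```

This is the total that ends up disagreeing with the real line count of the file.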
Experiments: Use this command in a second indexing pass:
git log -p -m --first-parent
Shows the history including change diffs, but only from the “main branch” perspective, skipping commits that come from merged branches, and showing full diffs of changes introduced by the merges. This makes sense only when following a strict policy of merging all topic branches when staying on a single integration branch.
^^ This outputs the real ledger of additions / deletions of files in the repository, but it doesn't contain all commits.
The idea is to:
index all commits using the regular command
run the second command to generate the list of commits that really matter for the line count of a file
tag each commit in the output of the second command as `marked_for_filesize`
To get the line count of a file, only sum the changes of the commits that are tagged.
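The steps above could look something like the following Python sketch. Everything here is illustrative: `FileChange` is a hypothetical stand-in for SourceFileChange, the set of hashes stands in for the `marked_for_filesize` tag, and the header regex assumes 40-character SHA-1 hashes. It also assumes each change record already carries the additions and deletions for its commit.

```python
import re
from dataclasses import dataclass

# Commit headers in `git log` output start at column 0 with "commit <sha>".
# Diff lines start with "+", "-" or a space, so they never match.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{40})\b")

@dataclass
class FileChange:
    """Hypothetical stand-in for a SourceFileChange record."""
    commit_sha: str
    path: str
    additions: int
    deletions: int

def marked_for_filesize(second_pass_output):
    """Collect the hashes of every commit listed by the second command."""
    marked = set()
    for line in second_pass_output.splitlines():
        match = COMMIT_RE.match(line)
        if match:
            marked.add(match.group(1))
    return marked

def line_count(changes, path, marked):
    """Sum (additions - deletions) for one path, counting tagged commits only."""
    return sum(
        change.additions - change.deletions
        for change in changes
        if change.path == path and change.commit_sha in marked
    )
```

Keeping the tag as a separate set means the first indexing pass stays untouched, and the second pass only decides which of the already indexed commits count toward the file size.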