visevol / GithubVisualisation

PFE028 Summer 2024
MIT License

[Backend] Implement line count on files #34

Closed zergov closed 5 days ago

zergov commented 6 days ago

My initial idea to get the line count of a SourceFile record was to sum (additions - deletions) over all SourceFileChange records of the file.

This is an abbreviated version of the git log command we use to index commits and file-change data:

git log --numstat --summary

If you sum (additions - deletions) over all the file change objects under a source file, the result doesn't match the actual line count of the file.
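
For reference, the naive sum described above can be sketched in Python. This is only an illustration, not the project's indexing code: the function name `naive_line_counts` and the sample log are made up. It assumes the tab-separated `added<TAB>deleted<TAB>path` rows that `--numstat` prints, with `-` in both count columns for binary files, and it does not try to resolve renames.

```python
from collections import defaultdict


def naive_line_counts(numstat_output: str) -> dict[str, int]:
    """Sum (additions - deletions) per path over every numstat row.

    Hypothetical helper: it treats every commit in the log the same way,
    which is exactly the approach that does not match the real line count.
    """
    totals: dict[str, int] = defaultdict(int)
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit headers, messages, --summary lines, blank lines
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-<TAB>-<TAB>path"
        totals[path] += int(added) - int(deleted)
    return dict(totals)


# Invented sample in the shape of `git log --numstat` output.
sample = (
    "commit 2222222\n"
    "\n"
    "    tweak app\n"
    "\n"
    "3\t4\tapp.py\n"
    "\n"
    "commit 1111111\n"
    "\n"
    "    add app\n"
    "\n"
    "10\t0\tapp.py\n"
    "-\t-\tlogo.png\n"
)

print(naive_line_counts(sample))  # {'app.py': 9}
```

On a linear history like this sample the sum is fine; the mismatch reported here shows up on real repositories.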


Experiments: Use this command in a second indexing pass:

git log -p -m --first-parent

Shows the history including change diffs, but only from the “main branch” perspective, skipping commits that come from merged branches, and showing full diffs of changes introduced by the merges. This makes sense only when following a strict policy of merging all topic branches when staying on a single integration branch.

^^ This outputs the real ledger of additions / deletions of files on the repository, but it doesn't contain all commits.

The idea is to:

  1. Index all commits using the regular command.
  2. Run the second command to generate the list of commits that really matter for the line count of a file.
  3. Tag each commit in the output of the second command as `marked_for_filesize`.
  4. To get the line count of a file, sum only the changes of the commits that are tagged.
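
The four steps above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the code from the pull request: the `Commit` and `FileChange` classes, both function names, and the sample data are hypothetical. Only the `marked_for_filesize` flag name comes from this issue. How the change rows of a merge commit get recorded is also an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    marked_for_filesize: bool = False  # flag name taken from the issue


@dataclass
class FileChange:
    commit_sha: str
    path: str
    additions: int
    deletions: int


def tag_commits(commits: dict[str, Commit], second_pass_shas: list[str]) -> None:
    """Step 3: tag every indexed commit that appears in the second pass."""
    for sha in second_pass_shas:
        if sha in commits:
            commits[sha].marked_for_filesize = True


def line_count(path: str, commits: dict[str, Commit], changes: list[FileChange]) -> int:
    """Step 4: sum (additions - deletions) over tagged commits only."""
    return sum(
        change.additions - change.deletions
        for change in changes
        if change.path == path and commits[change.commit_sha].marked_for_filesize
    )


# Step 1 (hypothetical data): "b" lives on a merged topic branch and
# "m" is the merge that brings the same 5 lines into the main line.
commits = {sha: Commit(sha) for sha in ("a", "b", "m")}
changes = [
    FileChange("a", "app.py", 10, 0),
    FileChange("b", "app.py", 5, 0),
    FileChange("m", "app.py", 5, 0),
]

# Step 2 (assumed result): the second pass lists only "m" and "a".
tag_commits(commits, ["m", "a"])

print(line_count("app.py", commits, changes))  # 15
```

With this invented data, summing every change would give 20, while summing only the tagged commits gives 15.
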

zergov commented 5 days ago

Done in https://github.com/visevol/GihubVisualisation/pull/40