Open carlosparadis opened 1 year ago
A simple first step Nicole pointed out is to parameterize the "master" branch in:
https://github.com/sailuh/kaiaulu/blob/6ff61e558cb2b2a658dc90c93649d41ab3f30022/R/parser.R#L532
Quoting Nicole here:
Regarding the branch flag for perceval in parse_gitlog(): as you said, i realized that this function throws an error if you did not check out the branch you would like to analyse in your local git repo. but in case the desired branch is checked out, it only extracts the information from this branch (e.g. if you specify apache/apr branch "evenset" and checked this out, it only inspects this branch and not more). to make this work, the branch must be added as a flag in the calls with and without regexp filtering (https://github.com/sailuh/kaiaulu/blob/master/R/parser.R#L530-L538).
From the above, my understanding is that:
In Nicole's example, a branch parameter will replace the hardcoded master string, and you would have passed the eventset branch. If you do that, then, per the quote, the resulting git log table contains only commits made to that branch.
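To make the parameterization concrete, here is a minimal sketch in Python (not Kaiaulu's R code). The function name build_git_log_args is hypothetical, and the flag list is only an assumed subset (--numstat, -M, -C) of what parse_gitlog actually passes; the point is just that the branch becomes an argument with "master" as the default.

```python
from typing import List


def build_git_log_args(git_dir: str, branch: str = "master") -> List[str]:
    """Build the argument list for a `git log` call restricted to one branch.

    Hypothetical helper: the real call in parser.R passes more flags than
    the assumed subset shown here.
    """
    if not branch:
        raise ValueError("branch must be a non-empty string")
    return [
        "--git-dir", git_dir,
        "log",
        "--numstat",  # per-file added/removed line counts
        "-M",         # detect renames
        "-C",         # detect copies
        branch,       # previously the hardcoded "master"
    ]


# Example: request the eventset branch instead of the default.
args = build_git_log_args("/path/to/apr/.git", branch="eventset")
print(" ".join(["git"] + args))
```

Keeping "master" as the default value would leave existing calls unchanged, which seems like the least disruptive way to introduce the parameter.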
My original intent in warning about git_checkout was actually a different one. Kaiaulu has parse_gitlog but also parse_dependencies. The former goes after the .git folder with Perceval. The latter goes after the src folder with Depends. The interface to the DV8 tool (DV8.R), which analyses architectural flaws, requires information originating from both.
What I originally meant was that, beyond the user making sure they correctly specify the branch parameter (item 3 above), they must also call git_checkout in Kaiaulu; otherwise their parsed .git will be about one branch and the src will be about another. This would silently produce badly erroneous results, and the mismatch is hard to notice.
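One way to guard against this mismatch is to compare the branch passed to the git log parser with the branch actually checked out in the working tree before running the dependency parser. The Python sketch below is only an illustration under that assumption: both function names are hypothetical and nothing like this exists in Kaiaulu as far as this issue states. It relies on git rev-parse --abbrev-ref HEAD, which prints the current branch name (or "HEAD" when detached).

```python
import subprocess


def current_branch(repo_path: str) -> str:
    """Return the branch checked out in the working tree at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. repo_path is not a repository
    )
    return result.stdout.strip()


def check_branch_consistency(requested: str, checked_out: str) -> None:
    """Fail loudly instead of silently mixing data from two branches."""
    if requested != checked_out:
        raise ValueError(
            f"git log was requested for branch '{requested}', but the "
            f"working tree is on '{checked_out}'; check out the same "
            "branch before parsing dependencies"
        )
```

A caller would run check_branch_consistency(branch, current_branch(repo_path)) right before the dependency step, so the error surfaces before any tables are joined.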
The DV8 Notebook probably needs to be updated to reflect that, as does any other Notebook associated with this kind of analysis.
I think one more thing we need to account for is whether the user wants to look at more than one branch. Can we specify more than one branch in our Perceval call, or would two or more calls to parse_gitlog be required? Considering Perceval does not seem to state which branch a commit originates from, it may be preferable to call parse_gitlog() once per branch, add a column with the branch name to each output table, and rbind the tables afterwards.
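The multi-call idea could look roughly like the Python sketch below. It is an illustration only: parse_branches and fake_parse are hypothetical names, fake_parse stands in for a per-branch parse_gitlog() call, and the commit hashes are made-up placeholders.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def parse_branches(
    branches: Iterable[str],
    parse_one: Callable[[str], List[Row]],
) -> List[Row]:
    """Parse each branch separately, tag rows with the branch, then combine."""
    combined: List[Row] = []
    for branch in branches:
        for row in parse_one(branch):
            tagged = dict(row)         # copy so the parser's rows stay untouched
            tagged["branch"] = branch  # the extra column proposed above
            combined.append(tagged)    # plays the role of rbind
    return combined


def fake_parse(branch: str) -> List[Row]:
    """Stand-in for a per-branch git log parse; the data is illustrative."""
    fake_logs = {
        "master": [{"commit_hash": "a1"}, {"commit_hash": "b2"}],
        "eventset": [{"commit_hash": "b2"}, {"commit_hash": "c3"}],
    }
    return fake_logs[branch]


table = parse_branches(["master", "eventset"], fake_parse)
for row in table:
    print(row["branch"], row["commit_hash"])
```

One thing this makes visible: git log for a branch lists every commit reachable from it, so a commit shared by two branches (b2 here) shows up once per branch in the combined table. Downstream code would need to decide whether to deduplicate or keep both rows.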
The last consideration is whether we will ever want to replace parse_gitlog() with another tool for parsing the git log, and what this would look like in the project configuration file.
We also have to consider the other flags besides the branch:
https://github.com/sailuh/kaiaulu/blob/6ff61e558cb2b2a658dc90c93649d41ab3f30022/R/parser.R#L518-L535
For example, Wolfgang also seems to have deferred figuring out the implications of some of these flags, such as -C and -M:
Various flags are passed alongside the --numstat flag to Git. This may alter the behavior of the code. We should also generalize the notebook to account for more than one branch.
See: https://github.com/sailuh/kaiaulu/issues/184#issuecomment-1525478233 for details