captainhookphp / captainhook

CaptainHook is a very flexible git hook manager for software developers that makes sharing git hooks with your team a breeze.
http://captainhook.info
MIT License
996 stars 86 forks source link

[FEATURE-REQUEST] New placeholder `ALL_CHANGED_FILES` #235

Closed Eydamos closed 8 months ago

Eydamos commented 8 months ago

I have run into another edge case. For me it is common to amend commits and force push the changes to the remote. If I use the CHANGED_FILES placeholder it will only list the files listed in the diff between the local hash and the remote hash. As git does not tell the hook that it is a force push there is no way around this behaviour.

Therefore I would propose the following solution: If you would add a ALL_CHANGED_FILES placeholder that always uses the hash from the reflog as the $oldHash for comparison (just like you now do if the oldHash is just zeros) it would give me always a list of all files that have changed in the current branch compared to its original branch.

For my use-case I basically only want to make sure that when you push, one of the files that was changed between the current branch and the branch it was created from is the CHANGELOG.md

sebastianfeldmann commented 8 months ago

That's strange. Force pushing works fine for me.

Git puts the following into the stdIn

refs/heads/main a74336878c058b6902df202db2f2253868cf2421 refs/heads/main 6e1fcae8e07c60775a9d777cbcfe54b3c63566d8

a74336 is the new commit hash 6e1fcae is the amended old hash.

The detected revision range 6e1fcae..a74336 works just fine to detect the files changed made with the amend.

Could you list a more detailed list of commands you execute to reproduce the problem?

Eydamos commented 8 months ago

Example: I change the following files:

I commit and push Then I realise I have a typo in my some_script.php so I fix it and the amend the commit and force push The CHANGED_FILES now only contains some_script.php because that is the difference between the local hash and the remote hash

sebastianfeldmann commented 8 months ago

And I would think that's correct :)

Because CHANGELOG.md is already on the remote in exactly that state. So if you have checks on .md files it should have been checked before.

If you use changes to .md files to trigger some other local workflow I would suggest writing a custom Condition or Action to implement the behavior you need.

Eydamos commented 8 months ago

Of course this is correct. Which is why I requested a ALL_CHANGED_FILES placeholder which shows all changed files since the creation of the branch. I at least see a use case for this and unfortunately there is also no way how I could add custom placeholders.

If you think this is not useful this is totally fine. Then I will just write a whole custom action which get's the changes from the reflog. As I have both the repository and IO it should be possible.

I just want to make sure that in your current branch you have modified the CHANGELOG.md file as sometimes people forget this and sometimes you also do not realised this in the PR and merge it without the CHANGELOG being changed

{
    "pre-push": {
        "enabled": true,
        "actions": [
            {
                "action": "echo {$ALL_CHANGED_FILES|of-type:md} | grep 'CHANGELOG.md'",
                "config": {
                    "label": "Check for 'CHANGELOG.md'"
                },
                "conditions": [
                    {
                        "exec": "\\Company\\Project\\Condition\\NotOnBranch",
                        "args": [
                            "master"
                        ]
                    }
                ]
            }
        ]
    }
}
sebastianfeldmann commented 8 months ago

Uhhh I like this

                "conditions": [
                    {
                        "exec": "\\Company\\Project\\Condition\\NotOnBranch",
                        "args": [
                            "master"
                        ]
                    }

I will steal that one for sure ;)

Regarding the {$ALL_CHANGED_FILES} placeholder. I'm not sure how useful it would be to include. What would you say to a placeholder {$BRANCH_START|compared-to:...}`?

Then you could use something like this

{
                "action": "git diff --name-only {$BRANCH_START|compared-to:master}..HEAD | grep 'CHANGELOG.md",
                "config": {
                    "label": "Check for 'CHANGELOG.md'"
                },
                "conditions": [
                    {
                        "exec": "\\Company\\Project\\Condition\\NotOnBranch",
                        "args": [
                            "master"
                        ]
                    }
                ]
}
sebastianfeldmann commented 8 months ago

I think this is what you need

git log --name-only --format='' master..HEAD | grep CHANGELOG.md

If you are sure that the master branch is the source of your branch you can use this. It will output all files changed by any commit that is in your current branch but not in the master branch.

If you create feature branches (2) from feature branches (1). Running this will within branch (2) will list all files from (1) and (2) so if anyone in (1) already changed the CHANGELOG.md you could get a false positive.

But this boils down to the same problem we faced with the PRE_PUSH stuff. If you don't provide a reference manually, in your case master determining it properly is only working with the reflog and that only works if you haven't fetched that branch from a remote.

Eydamos commented 8 months ago

I can not just compare to master. We have dozens of repositories so I made one repository that has a captainhook config with all the hooks. Some of those repositories have master some have main and some 2.x so I can not hardcode this.

BTW beside the NotOnBranch condition I also have a NotOnBranchByRegEx so I can compare it against /\d+\.x/ ;)

sebastianfeldmann commented 8 months ago

Me playing around with 5.20.* I run into more an more problems with the reflog solution.

If I push and then tag the repo and then just push the repo I get a zero hash for the tag because it does not exist on the remote.

This leads the Cap'n to use the reflog solution to figure out the changes. While working on the main branch this now returns basically all files of the repository :) Not sure that makes sense

I would have to check if the rev I'm receiving via stdIn is a tag or a commit. Maybe it would be best to introduce a new placeholder for the reflog stuff because now it causes more problems than it solves.

sebastianfeldmann commented 8 months ago

So here are my thoughts:

Ideas so far

{$BRANCH_CHANGES}, {$BRANCH_CHANGES|compared-to:master|of-type:md}

                "conditions": [
                    {
                        "exec": "\\Company\\Project\\Condition\\BranchChanges",
                        "args": [
                            {"compared-to": "master", "of-type": "md"}
                        ]
                    }
                ]

If the source branch is not specified I could try to find it using the reflog.

This would give you two options

  1. Push a branch immediately after creation to have a reference and use {$CHANGED_FILES} from there (my preferred solution)
  2. Use {$BRANCH_CHANGES} to be sure to test everything but accept, that during multiple pushes you check the same files multiple times

It would potentially slow down the hooks because it would run checks on all files changed in a branch on every push. But I think this is the best solution. What do you think?

Eydamos commented 8 months ago

I would go with option 2 {$BRANCH_CHANGES} as it is basically what I initially proposed and fits our needs and work style the most. I don't mind checking all the files on every push it is actually what I really want

Eydamos commented 8 months ago

BTW the issue you had with the tags is exactly why I have the condition NotOnBranch. Tags are only pushed from the master branch so I skip the check for this branch as the branch itself is protected any way so nobody could push actual changes to it

sebastianfeldmann commented 8 months ago

Ok, then I will rollback the {$CHANGED_FILES} behavior and introduce the new {$BRANCH_CHANGES} placeholder and Condition.

sebastianfeldmann commented 8 months ago

So i just released version 5.21.0

The new placeholder is called {$BRANCH_FILES}, "changes" was a bit vague and could also be a full diff.

So you should be able to do something like {$BRANCH_FILES|of-type:md}. If possible you should use the compared-to option but as you already stated that does not work for your use case. So if you don't provide that option the previous reflog solution is used to find the branching point of the current branch.

If that can't be found the placeholder is empty.

You can make sure to only execute the action if files can be detected by using the Branch\Files condition.

Eydamos commented 8 months ago

I get an exception:

Command failed:
  exit-code: 128
  message:   fatal: ambiguous argument '73cf48c..HEAD..': unknown revision or path not in the working tree.

The issue comes from calling Operator\Log::getChangedFilesSince() as $revision you give it $start . '..HEAD' but Log\Log::byRange() expects a $from and a $to param instead of a single string

sebastianfeldmann commented 8 months ago

ARGHHH, will fix in a sec stay tuned :)

sebastianfeldmann commented 8 months ago

Should be fixed in 5.21.2 released 30 sec ago :)

Eydamos commented 8 months ago

Works like a charm now. As usual thank you very much for the quick response and fix