Open sebastian-fredriksson-bernholtz opened 1 month ago
Hi, and thanks for taking the time to report this request.
In the case of running on GitHub Actions, Super-linter uses the following function to initialize diffs:
Where:
- `DEFAULT_BRANCH` is set using the corresponding configuration variable (default: the repository default branch when running on GitHub Actions, `master` otherwise).
- `GITHUB_SHA`: GitHub Actions sets this by default (see the reference about its values). Super-linter sets this to:
  - `push` events: uses it as it is.
  - `pull_request` events (as you mentioned): gets this from the event that triggered the workflow as `.pull_request.head.sha` instead of using the default "Last merge commit on the GITHUB_REF branch". This was initially implemented in #1305.
- `GITHUB_BEFORE_SHA` (needed to initialize the diff in case of `push` events): essentially, the hash of the commit that the tip of the pushed branch pointed to before pushing, with some adjustments to account for corner cases.

So, Super-linter doesn't need the whole repository history, but just the minimum amount of data to account for the above.
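To make the `GITHUB_SHA` selection described above concrete, here is a minimal Python sketch. It is not Super-linter's actual code (which is written in Bash); the function names `resolve_github_sha` and `load_event` are hypothetical. It only assumes the standard GitHub Actions variables `GITHUB_EVENT_NAME`, `GITHUB_EVENT_PATH` and `GITHUB_SHA`, and the `.pull_request.head.sha` field mentioned above.

```python
import json


def load_event(path: str) -> dict:
    """Read the webhook payload that GitHub Actions stores at GITHUB_EVENT_PATH."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def resolve_github_sha(event_name: str, event: dict, env_sha: str) -> str:
    """Pick the commit to diff against, following the rules listed above.

    event_name: value of GITHUB_EVENT_NAME
    event:      parsed payload (see load_event)
    env_sha:    value of GITHUB_SHA as set by GitHub Actions
    """
    if event_name == "pull_request":
        # Use the pull request head commit rather than the default merge commit.
        return event["pull_request"]["head"]["sha"]
    # push events (and, in this sketch, anything else): keep GITHUB_SHA as it is.
    return env_sha
```

In a workflow this would be called as `resolve_github_sha(os.environ["GITHUB_EVENT_NAME"], load_event(os.environ["GITHUB_EVENT_PATH"]), os.environ["GITHUB_SHA"])`. The consequence discussed in this thread follows directly: for `pull_request` events the returned commit is the head commit, which a depth-1 checkout of the merge commit does not contain.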
We suggest setting `fetch-depth: 0` (i.e. fetch everything) for simplicity. Super-linter emits that message when either `GITHUB_SHA` doesn't exist:
or when there's an error running the diff command:
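As a rough illustration of these two failure conditions (a hypothetical Python sketch, not the Bash code Super-linter runs), the check for a missing commit can be done with `git cat-file -e`, and a plain `git diff --name-only` between two commits stands in here for whatever diff command Super-linter actually uses. The names `git`, `commit_exists` and `changed_files` are made up for this example.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes git arguments and returns (exit code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def git(args: List[str]) -> Tuple[int, str]:
    """Run git in the current directory without raising on a non-zero exit."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout


def commit_exists(sha: str, run: Runner = git) -> bool:
    """Return True if `sha` names a commit that is present in the local repository."""
    code, _ = run(["cat-file", "-e", f"{sha}^{{commit}}"])
    return code == 0


def changed_files(base: str, head: str, run: Runner = git) -> List[str]:
    """List files that differ between two commits, failing in the two ways described above."""
    if not commit_exists(head, run):
        # First failure mode: the commit was never fetched (e.g. shallow checkout).
        raise LookupError(f"The reference {head} doesn't exist in this Git repository")
    code, out = run(["diff", "--name-only", base, head])
    if code != 0:
        # Second failure mode: the diff command itself failed.
        raise RuntimeError(f"git diff failed for {base} and {head}")
    return [line for line in out.splitlines() if line]
```

The `run` parameter exists only so the logic can be exercised without a real repository; by default the functions call the `git` executable.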
To your points:
> However, super-linter should be able to support just a shallow checkout on pull request and run a `git show` instead of a `git diff` (if that is what you do?) to get all the files that have changed.
`git show` is likely not meant for this, and would require some parsing + specifying a suitable output format. Also, it will probably have the same issue when you run it against a commit that you didn't fetch (I didn't test this). And, what about when you push more than one commit?
> Rather, the problem seems to be that super-linter overrides the GITHUB_SHA with the pull request head reference instead of leaving it as the pull request merge reference. So if the pull request merge reference is the commit that has been checked out with depth 1 (which is the default), the commit referenced by the pull request head reference doesn't exist, because it's one of the parent commits to the merge commit.
But you have a point here that we probably don't require the full git history. Also, we don't want to handle the fetch operations from within Super-linter because they can potentially modify the state of the local copy of your repository.
We'll be happy to review any PRs about this!
Is there an existing issue for this?
Current Behavior
Super-linter's documentation states that it needs the full history to determine what files have changed, and gives an error if not enough history has been fetched (see https://github.com/super-linter/super-linter/issues/5313).
Expected Behavior
For pull requests, super-linter should be able to just use a shallow checkout of the triggering commit to determine what files are being changed by the PR, since the triggering commit is the result of merging the pull request branch into the target branch. It is referenced in a hidden (but accessible) `refs/pull/PULL_REQUEST_NUMBER/merge` ref.
Anything else?
Sorry, I haven't really looked at the code, but I'm guessing that super-linter runs a `git diff` between the pull request branch and the target branch, because we got it to pass by fetching the history of the pull request branch and the default branch back to the point where their histories split (see https://github.com/super-linter/super-linter/issues/5313#issuecomment-2367022031)?

We did this because we have quite a big monorepo, and have to carefully manage the amount of data we fetch, because we use self-hosted runners (because we ran out of GitHub Actions minutes) and pay for the data we fetch.
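One possible way to fetch only as far back as the point where the two histories split is to deepen the shallow clone in steps until `git merge-base` succeeds. The Python sketch below is an illustration under stated assumptions, not necessarily what the linked comment does: it assumes a remote named `origin`, that remote-tracking refs such as `origin/<branch>` are updated by the fetch, and the step size and round limit are arbitrary. The function names are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# A runner takes git arguments and returns (exit code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def git(args: List[str]) -> Tuple[int, str]:
    """Run git in the current directory without raising on a non-zero exit."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout


def deepen_until_merge_base(
    base_branch: str,
    head_branch: str,
    run: Runner = git,
    remote: str = "origin",
    step: int = 50,
    max_rounds: int = 20,
) -> Optional[str]:
    """Deepen both branches until they share a commit; return it, or None if none is found."""
    base_ref = f"{remote}/{base_branch}"
    head_ref = f"{remote}/{head_branch}"
    for attempt in range(max_rounds + 1):
        if attempt > 0:
            # No common ancestor in the history fetched so far: fetch `step` more commits.
            run(["fetch", f"--deepen={step}", remote, base_branch, head_branch])
        code, out = run(["merge-base", base_ref, head_ref])
        if code == 0 and out.strip():
            return out.strip()
    return None
```

A small `step` keeps each fetch cheap at the cost of more round trips, which is the trade-off that matters when fetched data is billed.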
However, super-linter should be able to support just a shallow checkout on pull request and run a `git show` instead of a `git diff` (if that is what you do?) to get all the files that have changed.

At first, I thought the `The GITHUB_SHA reference (...) doesn't exist in this Git repository` error described in https://github.com/super-linter/super-linter/issues/5313 was caused by the fact that the triggering commit is a hidden commit that is not normally available (depending on how you check out). However, when looking at the PR that was mentioned as having introduced the issue (#4889), I found that doesn't really seem to be the issue.

Rather, the problem seems to be that super-linter overrides the GITHUB_SHA with the pull request head reference instead of leaving it as the pull request merge reference. So if the pull request merge reference is the commit that has been checked out with depth 1 (which is the default), the commit referenced by the pull request head reference doesn't exist, because it's one of the parent commits to the merge commit.
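To illustrate the idea of working from the merge commit alone, here is a hedged Python sketch that lists the files in which a merge commit differs from its first parent. It assumes that the first parent of the pull request merge commit is the tip of the target branch, and that the checkout was deep enough for that parent to exist locally (for example `fetch-depth: 2` rather than the default of 1). The function names are hypothetical, and this is not how Super-linter currently works.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes git arguments and returns (exit code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def git(args: List[str]) -> Tuple[int, str]:
    """Run git in the current directory without raising on a non-zero exit."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout


def files_changed_by_merge(merge_ref: str = "HEAD", run: Runner = git) -> List[str]:
    """List files that differ between a merge commit and its first parent."""
    code, out = run(["diff", "--name-only", f"{merge_ref}^1", merge_ref])
    if code != 0:
        # Typically means the parent commit was not fetched (checkout too shallow).
        raise RuntimeError(f"could not diff {merge_ref} against its first parent")
    return [line for line in out.splitlines() if line]
```

Because the comparison is between the merge result and the target branch tip, the number of commits pushed to the pull request branch does not matter for this particular diff.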