mitre / hipcheck

Automatically assess and score software repositories for supply chain risk.
https://mitre.github.io/hipcheck/
Apache License 2.0

Enable specifying a specific Git ref to use when analyzing a repository. #183

Closed alilleybrinker closed 1 month ago

alilleybrinker commented 2 months ago

Today, Hipcheck just analyzes a repository's "current" checked out ref. For a remote repository, this means the default branch of the repo (usually main or master, but it's really whatever the default is on the remote). For a local repository, that means whatever ref is currently checked out.

However, it is sometimes desirable to target a specific ref, for example the ref for a specific tagged version, rather than just "latest" on the current branch. To support this, we should add a --ref flag to the hc check CLI which allows specifying a Git ref. We may want to do some parsing here to provide nice error messages, but under the hood we would likely just be passing the ref to git checkout. See the Git docs for more information about refs: https://git-scm.com/book/en/v2/Git-Internals-Git-References
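
As a rough illustration of the "parsing for nice error messages" idea, here is a minimal sketch that rejects obviously bad --ref values before they reach git checkout. The names RefError and validate_ref are hypothetical and not part of Hipcheck. The checks are only a subset of the rules in the git check-ref-format documentation, plus a guard against values that start with "-". It assumes plain names (branches, tags, hashes) are enough, so it deliberately rejects revision expressions such as HEAD~2, which a real implementation might want to allow.

```rust
/// Reasons a user-supplied ref can be rejected before Git ever sees it.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum RefError {
    Empty,
    LooksLikeOption,
    ForbiddenSequence(&'static str),
    ForbiddenChar(char),
}

/// Check a `--ref` value against a subset of Git's ref-name rules.
/// Returns the input unchanged when it passes every check.
pub fn validate_ref(input: &str) -> Result<&str, RefError> {
    if input.is_empty() {
        return Err(RefError::Empty);
    }
    // A leading '-' would be parsed by `git checkout` as an option.
    if input.starts_with('-') {
        return Err(RefError::LooksLikeOption);
    }
    // Sequences that Git does not allow anywhere in a ref name.
    for seq in ["..", "@{", "//"] {
        if input.contains(seq) {
            return Err(RefError::ForbiddenSequence(seq));
        }
    }
    // Individual characters that Git does not allow in a ref name.
    for c in input.chars() {
        if c.is_ascii_control() || " ~^:?*[\\".contains(c) {
            return Err(RefError::ForbiddenChar(c));
        }
    }
    Ok(input)
}
```

Each error variant can be mapped to its own message, so a user who passes "a..b" is told which sequence was the problem instead of seeing raw Git output.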

One thing to note is that, for remote repositories, this is easy. We already clone the repository locally, so if a ref is provided we can just add a step to check out that ref before proceeding with analysis.

For a local repository it is slightly more complicated. If the repository is in a state where git checkout <ref> can work (meaning there are no uncommitted changes which would be lost), then we could simply record the checkout that was present before, switch to the one we want, then switch back when Hipcheck is done. This raises a challenge: if Hipcheck is killed before it can finish, we've silently left the target of analysis on a different checkout. Even if we install a signal handler for graceful shutdowns, hard shutdowns which bypass it are still possible.
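
The record, switch, and switch-back approach can be sketched with a drop guard. This is an illustration only: RestoreGuard and with_temporary_checkout are hypothetical names, and the checkout step is passed in as a closure (assumed to wrap git checkout) so the control flow is visible without touching a real repository. The guard's restore action runs on normal return and during an unwinding panic, but not when the process is killed outright, which is exactly the weakness described above.

```rust
/// Runs a restore action when dropped, e.g. checking the original ref back out.
/// (Hypothetical helper, for illustration only.)
pub struct RestoreGuard<F: FnMut()> {
    restore: F,
}

impl<F: FnMut()> RestoreGuard<F> {
    pub fn new(restore: F) -> Self {
        RestoreGuard { restore }
    }
}

impl<F: FnMut()> Drop for RestoreGuard<F> {
    fn drop(&mut self) {
        (self.restore)();
    }
}

/// Switch to `target`, run `analyze`, then switch back to `original`.
/// `checkout` is assumed to wrap `git checkout <ref>`; error handling is omitted.
pub fn with_temporary_checkout<T>(
    original: &str,
    target: &str,
    checkout: &dyn Fn(&str),
    analyze: impl FnOnce() -> T,
) -> T {
    checkout(target);
    // Dropped when this function returns or unwinds, restoring `original`.
    let _guard = RestoreGuard::new(|| checkout(original));
    analyze()
}
```

A hard kill (for example SIGKILL) never runs the guard, so the repository would stay on the target ref.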

If the repository is not in a state where Hipcheck can run git checkout, then we're in a tougher spot.

The "most correct" option in all cases would be, if a ref is specified, to clone the local repo into the cache directory, which avoids mutating the one already on disk. Then we can freely check out other refs on our clone without worrying about leaving things in a bad state.
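
A minimal sketch of the clone-into-cache approach, assuming Git is invoked as a subprocess. The function names plan_isolated_checkout and run_all are hypothetical, and the cache path is whatever the caller supplies. The commands are built separately from being run so they can be inspected first.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Build the Git commands that clone `source` into `dest` and check out
/// `git_ref` in the clone, leaving the original repository untouched.
/// (Hypothetical helper, for illustration only.)
pub fn plan_isolated_checkout(source: &Path, dest: &Path, git_ref: &str) -> Vec<Command> {
    // Clone without populating the working tree; the checkout below does that.
    let mut clone = Command::new("git");
    clone.arg("clone").arg("--no-checkout").arg(source).arg(dest);

    // Run the checkout inside the clone via `git -C <dir>`.
    let mut checkout = Command::new("git");
    checkout.arg("-C").arg(dest).arg("checkout").arg(git_ref);

    vec![clone, checkout]
}

/// Run each command in order, stopping at the first failure.
pub fn run_all(commands: Vec<Command>) -> io::Result<()> {
    for mut cmd in commands {
        let status = cmd.status()?;
        if !status.success() {
            let msg = format!("{:?} failed with {}", cmd, status);
            return Err(io::Error::new(io::ErrorKind::Other, msg));
        }
    }
    Ok(())
}
```

Because every checkout happens in the clone, a killed run leaves at worst a stale directory in the cache. One caveat: in the clone, branches other than the source's current one exist only as remote-tracking branches, and commits not reachable from any ref in the source are not copied at all.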

Landing this new --ref flag would also pave the way for other future improvements, like supporting "VCS URLs" for specifying targets, which are permitted to embed a ref, or trying to check out a specific version tag when given a package version in the "package to source repository" target code path.

j-lanson commented 1 month ago
alilleybrinker commented 1 month ago

@j-lanson I believe we've achieved this, but want to confirm before closing.

alilleybrinker commented 1 month ago

Actually, going to close and we can open new issues for any specific further enhancements.