mckinsey / vizro

Vizro is a toolkit for creating modular data visualization applications.
https://vizro.readthedocs.io/en/stable/
Apache License 2.0
2.46k stars 109 forks

[Bug] Fix `hatch run secrets` command #427

Closed maxschulz-COL closed 2 months ago

maxschulz-COL commented 2 months ago

Description

This PR closes https://github.com/McK-Internal/vizro-internal/issues/607. A baseline is not necessary, because our public commit history contains no secrets. However, the `hatch run secrets` command no longer worked, so this PR updates it.

Details on learnings

These are mostly notes to remember for later. They also explain why the output of `hatch run secrets` differed between users.

gitleaks operates by scanning the diffs from `git log -p`, which does the following:

`git log` shows the commit log reachable from the refs (heads, tags, remotes).

  • this is the public record, which is included in pushes and fetches
  • if a secret is found there, it is in principle publicly accessible
  • deleting stale branches that no longer exist in the remote repo (e.g. pruning remote-tracking branches via `git fetch --all --prune`) can thus remove secrets that gitleaks would otherwise find locally
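
To make the point about refs concrete, here is a minimal sketch in a throwaway repository. It assumes the scan behaves like `git log -p --all` (all refs); the branch name `stale-branch`, the file `config.env`, and the fake value `not-a-real-secret` are made up for illustration and are not part of this repo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git config commit.gpgsign false

# One harmless commit on the default branch.
echo "hello" > README.md
git add README.md
git commit -q -m "initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# A hypothetical leaked value, committed on a side branch only.
git checkout -q -b stale-branch
echo "API_KEY=not-a-real-secret" > config.env
git add config.env
git commit -q -m "add config"
git checkout -q "$main_branch"

# While the branch ref exists, its diff is part of `git log -p --all`.
before=$(git log -p --all | grep -c "not-a-real-secret" || true)

# Deleting the ref makes the commit unreachable from any ref.
git branch -q -D stale-branch
after=$(git log -p --all | grep -c "not-a-real-secret" || true)

echo "matches before deleting the branch: $before"
echo "matches after deleting the branch:  $after"
```

Two users with different sets of local or remote-tracking branches would see different results from the same kind of scan, which matches what we observed.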

Can secrets hide somewhere else?

Yes and no. For example, we saw that although gitleaks no longer found the secret (after deleting stale branches), it was still possible to check out the problematic commit (as we knew the SHA)! This is because of `git reflog`:

`git reflog` is a local record of the commits that are or were referenced in your repo, kept until the entries expire.

  • this is the reason why we could still checkout the commit
  • git reflog doesn't traverse HEAD's ancestry at all. The reflog is an ordered list of the commits that HEAD has pointed to: it's undo history for your repo. The reflog isn't part of the repo itself (it's stored separately to the commits themselves) and isn't included in pushes, fetches or clones; it's purely local.

  • after cloning the repo anew, the problematic commit can no longer be found
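
The reflog behaviour can be reproduced with a small sketch, again in a throwaway repository with made-up names (`stale-branch`, `config.env`, `not-a-real-secret`). It uses `git clone --no-local` on the assumption that this approximates a clone from a remote; a plain local-path clone may copy the object store wholesale and would not show the same effect.

```shell
#!/bin/sh
set -eu

# Throwaway "origin" repository with one commit on the default branch.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.email "dev@example.com"
git config user.name "Dev"
git config commit.gpgsign false
echo "hello" > README.md
git add README.md
git commit -q -m "initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Commit a hypothetical secret on a side branch, remember its SHA,
# then delete the branch so that no ref points at the commit anymore.
git checkout -q -b stale-branch
echo "API_KEY=not-a-real-secret" > config.env
git add config.env
git commit -q -m "add config"
leaked_sha=$(git rev-parse HEAD)
git checkout -q "$main_branch"
git branch -q -D stale-branch

# Locally the commit object still exists and HEAD's reflog still lists it,
# which is why checking it out by SHA keeps working.
if git cat-file -e "$leaked_sha^{commit}" 2>/dev/null; then
  local_state=present
else
  local_state=missing
fi
reflog_hits=$(git reflog show --format=%H | grep -c "$leaked_sha" || true)

# A fresh clone over the normal transport only receives objects
# reachable from refs; the reflog is not transferred.
git clone -q --no-local "$work/origin" "$work/clone"
if git -C "$work/clone" cat-file -e "$leaked_sha^{commit}" 2>/dev/null; then
  clone_state=present
else
  clone_state=missing
fi

echo "commit in original repo: $local_state (reflog entries: $reflog_hits)"
echo "commit in fresh clone:   $clone_state"
```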

Conclusion

maxschulz-COL commented 2 months ago

I don't understand it entirely 😅 But I think the rationale for storing the baseline file (even if it's only an empty list) was the pentest? We had to generate it from scratch, and then the question came up whether we could just store it in the repository.

@Joseph-Perkins Can you help out here?