Closed domwhewell-sage closed 2 months ago
Marking this for review now. Unfortunately, trufflehog is duplicating the discovered keys, as technically they are within 2 different "lines" of the zip file. That's probably something to think about in our trufflehog module: we could keep an internal set() to de-dupe discovered secrets.
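A minimal sketch of the set()-based de-dupe idea, assuming each finding can be reduced to a detector name plus the raw secret. The `Finding` tuple and `dedupe_findings` helper are hypothetical names for illustration and are not part of the existing trufflehog module:

```python
from typing import Iterable, Iterator, NamedTuple


class Finding(NamedTuple):
    """Hypothetical shape of one trufflehog result (not bbot's real type)."""
    detector: str   # e.g. "AWS"
    raw: str        # the secret value as reported
    location: str   # file inside the archive where it was seen


def dedupe_findings(findings: Iterable[Finding]) -> Iterator[Finding]:
    """Yield each (detector, raw) pair once, keeping the first location seen."""
    seen = set()
    for finding in findings:
        key = (finding.detector, finding.raw)
        if key in seen:
            continue  # same secret already reported from another file
        seen.add(key)
        yield finding
```

Keying on the secret rather than on the file means the same key found in two log files is reported once, at the cost of dropping the second location.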
Also, the repo I used to test this on had an aws_access_key, aws_secret_access_key and aws_session_token in the workflow log, and trufflehog wasn't picking them up, so that's a bug I'll have to pick up with the developers of that tool.
Finally, the "location" where a discovered secret would be found is a run_XXXXXXX.zip file, which obviously will mean nothing to the user of bbot, so we would need some way of linking this to the original CODE_REPOSITORY event. (https://github.com/blacklanternsecurity/bbot/issues/1319 ?)
Theoretically: CODE_REPOSITORY -> FILESYSTEM -> FINDING.
Nothing to change in this module, but these are all things to think about for the trufflehog module changes required to make this module yield secrets.
For now, can we add a description to the FILESYSTEM event that says something like, "these are logs from the GitHub workflow <workflow> on <repo> at <time>"?
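Something like the following could build that description. This is only a sketch: the `workflow_log_event_data` helper is hypothetical, and the dict layout with `path` and `description` keys is an assumption about what the FILESYSTEM event data would carry, not the module's confirmed format:

```python
def workflow_log_event_data(path, workflow, repo, time):
    """Build illustrative FILESYSTEM event data with a human-readable description.

    The key names here are assumptions made for this sketch.
    """
    description = (
        f"These are logs from the GitHub workflow {workflow} on {repo} at {time}"
    )
    return {"path": str(path), "description": description}
```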
Nice work on this! I made a small tweak to the error handling, let me know if it looks good.
I've made a modification to prevent the duplication, as the downloaded zip archive contains a structure like:
allsteps.txt
folder/
- step1.txt
- step2.txt
Therefore a secret could be in both allsteps.txt and step2.txt, which would make trufflehog raise the finding twice for the same secret.
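One possible way to avoid the overlap, sketched under the assumption that the combined log sits at the root of the archive and the per-step copies sit in subfolders. The `top_level_logs` helper is a hypothetical name, and this is not necessarily how the modification in this PR does it:

```python
import zipfile


def top_level_logs(archive):
    """Return the names of .txt members stored at the root of the archive.

    `archive` may be a path or a file-like object. Zip member names always
    use "/" as the separator, so a name without "/" is a root-level file.
    """
    with zipfile.ZipFile(archive) as logzip:
        return [
            name
            for name in logzip.namelist()
            if name.endswith(".txt") and "/" not in name
        ]
```

Scanning only the root-level log means each secret is seen once, though the finding then no longer says which individual step printed it.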
This PR adds a new module to download workflow logs from a repository, as mentioned in https://github.com/blacklanternsecurity/bbot/issues/1305.
It will always try all workflows in the repository. By default, 1 successful log is downloaded for each, and you can specify num_logs up to a maximum of 100 logs for each workflow.
It raises FILESYSTEM events for the downloaded workflow logs archive.
The plan is to run trufflehog against these archives, but first I want to double-check that trufflehog runs against them without loads of duplicates (unzipping the archive manually, there's a large logfile and smaller logfile "chunks" that seem to duplicate the content of the large log).
bbot -t blacklanternsecurity.com -m github_org, github_workflows --config modules.github_org.api_key=<api_token>
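For reference, the flow described above maps onto three GitHub REST API endpoints: list workflows, list a workflow's successful runs, and download a run's logs. The sketch below only builds the URLs; the helper names are hypothetical, the clamping of num_logs to 100 mirrors the limit stated above, and the module's actual request code may differ:

```python
from urllib.parse import urlencode

API = "https://api.github.com"
MAX_LOGS = 100  # upper bound on num_logs, matching the limit described above


def workflows_url(owner, repo):
    """URL listing every workflow defined in the repository."""
    return f"{API}/repos/{owner}/{repo}/actions/workflows"


def successful_runs_url(owner, repo, workflow_id, num_logs=1):
    """URL listing the most recent successful runs of one workflow."""
    per_page = max(1, min(int(num_logs), MAX_LOGS))  # clamp to 1..100
    query = urlencode({"status": "success", "per_page": per_page})
    return (
        f"{API}/repos/{owner}/{repo}/actions/workflows/{workflow_id}/runs?{query}"
    )


def run_logs_url(owner, repo, run_id):
    """URL that serves the zip archive of logs for a single run."""
    return f"{API}/repos/{owner}/{repo}/actions/runs/{run_id}/logs"
```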