Closed nutjob4life closed 8 months ago
@nutjob4life - thank you for testing out our guide and for finding this bug! Appreciate you taking the time to report on this.
I just tested out a test scan on a repository using the three approaches below:
detect-secrets scan ./ --all-files --disable-plugin AbsolutePathDetectorExperimental --exclude-files '.secrets.*' --exclude-files '.git*' > .secrets.baseline.exclude-glob
detect-secrets scan ./ --all-files --disable-plugin AbsolutePathDetectorExperimental --exclude-files '\.secrets\..*' --exclude-files '\.git.*' > .secrets.baseline.exclude-regex
detect-secrets scan ./ --all-files --disable-plugin AbsolutePathDetectorExperimental > .secrets.baseline.no-exclude
I then ran the following command against each of the .secrets...
files above: cat .secrets.baseline | grep ".git"
. Here's what I saw:
.git
files.git
files.git
filesSo, it seems in terms of the guide, the pattern --exclude-files '.secrets.*' --exclude-files '.git*'
is functionally equivalent for my repository compared to --exclude-files '\.secrets\..*' --exclude-files '\.git.*'
. This is a special case most likely.
@nutjob4life - did you see have a use case where you tried to include other exclude-files
clauses and you had to use the RegEx approach to make it work? Also - if you have the time, would you be interested in proposing a PR for this fix in the guide? Would be great to see you have commit credit for your find!
@perryzjc - I think @nutjob4life's suggestion is spot on and we may need an update to the guide to reflect. Probably a clarification about glob vs regex. Thoughts?
Keeping glob patterns when the --help
and the implementation itself clearly uses regexes is only going to cause future confusion. I'm +1 for updating the guide.
Here's what I ended up using for Python repositories:
detect-secrets scan . \
--all-files \
--disable-plugin AbsolutePathDetectorExperimental \
--exclude-files '\.secrets..*' \
--exclude-files '\.git.*' \
--exclude-files '\.mypy_cache' \
--exclude-files '\.pytest_cache' \
--exclude-files '\.tox' \
--exclude-files '\.venv' \
--exclude-files 'venv' \
--exclude-files 'dist' \
--exclude-files 'build' \
--exclude-files '.*\.egg-info' > .secrets.baseline
And for Maven:
detect-secrets scan . \
--all-files \
--disable-plugin AbsolutePathDetectorExperimental \
--exclude-files '\.secrets..*' \
--exclude-files '\.git.*' \
--exclude-files 'target' > .secrets.baseline
Although I'm still not happy with either of these. For example, suppose you've got a Python repository with
📁 my.package
📁 src
📁 my
📁 package
🗒️ __init__.py
📁 accessory
📁 btree
📁 dist
📁 parser
or a Maven package with
📁 stem-analyzer
📁 src
📁 main
📁 java
📁 org
📁 forestry
📁 stem
📁 analysis
📁 chooser
📁 multiplexer
📁 target
📁 yamlparser
Then in the Python case, the module my.package.dist
gets ignored and in the Maven case the module org.forestry.stem.analysis.target
gets ignored.
On top of that, the command-line argument is --exclude-files
which suggests it's just files that gets excluded, when it's files and directories. But that's Yelp's fault 😝
@nutjob4life - added your contribution to this PR: https://github.com/NASA-AMMOS/slim/pull/143
Thanks again!
Wooot! Thanks @riverma! 🥂
Resolved with https://github.com/NASA-AMMOS/slim/pull/143
Checked for duplicates
Yes - I've already checked
Website or Best Practice Guide?
Best Practice Guide
Describe the bug
When I tried integrating
NASA-AMMOS/slim-detect-secrets
into a PDS Engineering Node project, I noticed that the instructions in this guide kept treating thedetect-secrets
command's--exclude-files
argument as a "glob" style file patterns:However, when I run
detect-secrets scan --help
, the usage says:which says it's a regex, not a glob.
So, diving into the code:
And sure enough
.search()
is a function on a regex object, not a shell glob.What did you expect?
I expected the docs to use regexes, not globs:
should become
The same should be applied to the
args
in the pre-commit config later in the docs.Reproducible steps
Environment