Closed timwsuqld closed 4 months ago
I would say this is applicable only to HTML files and maybe JS and CSS files. We don't really do any special handling for these file types at the moment. I would be hesitant to whitelist this for everyone for all file types.
As a side note, #310 might enable a whitelist like this.
Would it be possible to write a whitelist for the hashtype-hash pattern, e.g. sha256-hash?
I'm not sure how hard it would be to write code to detect these strings, and then verify if the hash is the right size to match the hash type, so we can ignore them? (Assuming that we agree a hash isn't a secret?)
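For what it's worth, a rough sketch of that check in Python might look like the following. This is not detect-secrets code: the names DIGEST_SIZES, HASH_PATTERN, and is_typed_hash are made up for illustration, and it assumes the hash part is a base64-encoded digest (as in SRI integrity values) using sha256, sha384, or sha512.

```python
import base64
import binascii
import re

# Raw digest sizes in bytes for each hash type (hypothetical lookup table).
DIGEST_SIZES = {"sha256": 32, "sha384": 48, "sha512": 64}

# Hypothetical pattern: "<hashtype>-<base64 digest>".
HASH_PATTERN = re.compile(r"(sha256|sha384|sha512)-([A-Za-z0-9+/]+={0,2})")


def is_typed_hash(candidate: str) -> bool:
    """Return True if candidate is a hashtype-hash string of the right size."""
    match = HASH_PATTERN.fullmatch(candidate)
    if match is None:
        return False
    algorithm, encoded = match.groups()
    try:
        # validate=True rejects characters outside the base64 alphabet.
        digest = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return False
    # The decoded digest must be exactly as long as the hash type implies.
    return len(digest) == DIGEST_SIZES[algorithm]
```

A string that passes this check could then be skipped rather than reported, assuming we agree that a correctly sized, type-prefixed hash isn't a secret.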
I agree hashes aren't secrets. The challenge with whitelisting hashes is that they look a lot like base64 or hex secrets. I agree that the sha256-hash string is a good indicator that the string isn't a secret, but I'm not confident enough to say that this is the case for every single codebase.
One way we could implement this is to have a default --word-list packaged with detect-secrets and include sha256-hash in that word list. We don't have a default --word-list though.
Another way we could do this whitelisting is to implement an HTML file parser and make our parser discard the integrity attributes. This seems a bit heavy-handed, but could it improve HTML secret scanning accuracy? We don't use many raw HTML files at Yelp so it's hard to judge detect-secrets's existing HTML accuracy.
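To make the parser idea concrete, here is a minimal sketch using Python's standard html.parser module. It is only an illustration, not how detect-secrets handles files today: IntegrityStripper and attributes_to_scan are hypothetical names, and it only looks at tag attributes, not text or inline scripts.

```python
from html.parser import HTMLParser


class IntegrityStripper(HTMLParser):
    """Collect tag attributes, skipping any integrity attribute."""

    def __init__(self):
        super().__init__()
        self.scannable = []  # (tag, attribute name, attribute value)

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names before calling this hook.
        for name, value in attrs:
            if name == "integrity" or value is None:
                continue  # drop SRI hashes and valueless attributes
            self.scannable.append((tag, name, value))


def attributes_to_scan(html: str):
    """Return the attribute values a secret scanner would still look at."""
    parser = IntegrityStripper()
    parser.feed(html)
    parser.close()
    return parser.scannable
```

With something like this, only the remaining attribute values would be handed to the entropy plugins, so the integrity hashes never get flagged in the first place.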
Is there a particular reason that passing that line in every time is annoying?
We're going to close this issue as it hasn't received any update in a very long time. Feel free to re-open it if you think it's still relevant.
link/script tags with an integrity value are detected as secrets. We should be able to easily filter these out as they have a fixed format, including the hash type.
I've used --exclude-lines 'integrity="sha256' for now to filter them out, but it may be nice to have that as a default somewhere? (Whitelist in the Base64 High Entropy String plugin?)
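For anyone wondering what that workaround amounts to, here is a rough Python sketch of regex-based line exclusion. It is an approximation for illustration, not detect-secrets' actual implementation, and lines_to_scan is a hypothetical helper name.

```python
import re

# Same regex as the --exclude-lines workaround above.
EXCLUDE_LINES = re.compile(r'integrity="sha256')


def lines_to_scan(source: str):
    """Return the lines that would still be scanned after exclusion."""
    return [
        line
        for line in source.splitlines()
        if not EXCLUDE_LINES.search(line)
    ]
```

Note that this pattern only covers sha256 values. Tags using sha384 or sha512 would still be flagged unless the regex is widened, for example to integrity="sha(256|384|512).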