Yelp / detect-secrets

An enterprise friendly way of detecting and preventing secrets in code.
Apache License 2.0

Review and improve regex rules #159

Open domanchi opened 5 years ago

domanchi commented 5 years ago

There was a recent white paper released (summary, source).

What's most interesting is that on page 15 they list a variety of explicit regexes that we may be able to incorporate into our scanning. I think we already cover about 80% of them (mostly with the high entropy scanner), but there are some interesting ones to extract from that list. e.g.:

We should go through this list and create new plugins for the ones that we're missing.

killuazhu commented 5 years ago

I love the idea. Being able to identify the type of the token more deterministically can also support #153.

domanchi commented 5 years ago

A couple of notes from this paper worth mentioning (for posterity):

KevinHock commented 5 years ago

I thought this part was another cool thing to experiment with:

Section III, Part D:

Note that each regular expression was prefixed with negative lookbehind (?<![\w]) and suffixed with negative lookahead (?![\w]) to ensure that no word characters appeared before or after the regular expression match and improve accuracy.
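
To make the quoted technique concrete, here is a minimal Python sketch of wrapping a regex with those two lookarounds. The helper name `wrap_with_word_guards`, the sample strings, and the key pattern are assumptions for illustration only (the pattern follows the publicly documented shape of an AWS access key ID and is not taken from the paper's list); this is not detect-secrets' actual plugin code.

```python
import re


def wrap_with_word_guards(pattern: str) -> "re.Pattern[str]":
    """Hypothetical helper: reject matches that touch a word character.

    Prefixes the pattern with the negative lookbehind (?<![\\w]) and
    suffixes it with the negative lookahead (?![\\w]), as the paper describes.
    The non-capturing group keeps any alternation inside `pattern` contained.
    """
    return re.compile(r'(?<![\w])(?:' + pattern + r')(?![\w])')


# Illustrative pattern (assumption): "AKIA" followed by 16 uppercase
# letters or digits.
KEY_PATTERN = r'AKIA[0-9A-Z]{16}'

unguarded = re.compile(KEY_PATTERN)
guarded = wrap_with_word_guards(KEY_PATTERN)

# The token stands alone between quotes, so both regexes should match it.
sample_standalone = 'aws_key = "AKIAIOSFODNN7EXAMPLE"'
# The same characters are buried inside a longer word-like string.
sample_embedded = 'checksum=XXAKIAIOSFODNN7EXAMPLEYY'

for name, regex in (('unguarded', unguarded), ('guarded', guarded)):
    for text in (sample_standalone, sample_embedded):
        match = regex.search(text)
        print(name, repr(text), '->', match.group(0) if match else None)
```

With this setup, only the unguarded regex reports the embedded string, which is the kind of false positive the lookarounds are meant to cut down. The trade-off is that a real secret glued to other word characters (including `_`) would be missed, so it seems worth experimenting with per plugin rather than applying blindly.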