Closed tsellers-r7 closed 3 years ago
CC @hdm @pberry25 @dabdine
Seems great here! Any chance we can get a test case to scan new regexes for the incompatible variants? ([:digit:], non-leading ?i, etc).
@hdm - I'm leaning towards separating the fingerprint databases into their own repo, then adding language- and tool-specific tests that confirm the patterns are valid in the standard or common regex libraries for Ruby, Go, Java, etc., and that they also pass the built-in tests against the examples.
See #138, which tracks when tests for this can be added :smiley:
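For illustration only, here is a minimal sketch of what a scan for the incompatible variants mentioned above ([:digit:], non-leading ?i) might look like. It is not the project's test suite: the function name `find_incompatibilities` and the rule set are assumptions, and the check is a text heuristic that does not parse the regex, so an escaped literal such as `\(\?i\)` would be reported as a false positive.

```python
import re

# Hypothetical rules, assumed for this sketch; they cover only the two
# issues named in this thread.
POSIX_CLASS = re.compile(r"\[:[a-z]+:\]")            # e.g. [:digit:]
INLINE_FLAGS = re.compile(r"\(\?[a-zA-Z]+\)")        # e.g. (?i) or (?im)
LEADING_FLAGS = re.compile(r"(?:\(\?[a-zA-Z]+\))*")  # run of flag groups at the start


def find_incompatibilities(pattern):
    """Return a list of human-readable problems found in one regex string."""
    problems = []
    for match in POSIX_CLASS.finditer(pattern):
        problems.append(f"POSIX class {match.group(0)} at offset {match.start()}")
    # Flag groups are accepted only as an unbroken run at the very start,
    # so scanning begins where that run ends.
    leading_end = LEADING_FLAGS.match(pattern).end()
    for match in INLINE_FLAGS.finditer(pattern, leading_end):
        problems.append(f"non-leading flag {match.group(0)} at offset {match.start()}")
    return problems


if __name__ == "__main__":
    # Made-up sample patterns, not taken from the fingerprint databases.
    for sample in [r"(?i)^ssh-\d+", r"^ssh(?i)-[[:digit:]]+"]:
        print(sample, find_incompatibilities(sample))
```

A check along these lines could run over every pattern in the fingerprint files and fail the build when the returned list is non-empty. Validating patterns against each target language's actual regex engine, as discussed above, would still be the more reliable test.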
Description
The intent of this PR is to improve cross-language compatibility for the regex patterns used.
The changes:
- Moves inline flags such as `(?i)` and `(?m)` to the beginning of the regex. In Python these flags have global meaning, and using them anywhere other than the start of the pattern is deprecated. This does not change the function of these flags in current patterns for other languages.
- Replaces POSIX character class indicators (`[[:alnum:]]`, `[[:digit:]]`, and `[[:space:]]`) with either the short backslash version (`\d`, `\s`) or a character range (`a-zA-Z0-9`). The reason is that certain languages (Java, Python, JavaScript) don't support POSIX character classes.
How Has This Been Tested?
rspec, 3rd party regex testers.
Types of changes
Fingerprint construction
Checklist: