Closed nichogenius closed 7 years ago
Oh god :) Lots of changes. I'll check on the weekend, but I don't see any problem with accepting; I just want to be sure. Thanks!
As far as detriments go, I think I only dropped about 3 patterns (from the original) that were causing too many false positives for my liking: Spammer, (chr(\d+).){4}, and one other I can't think of right now.
Most of the changes were adding functionality or just tidying some stuff up.
Probably best if you try out the changes before accepting... the output formatting might be a bit too much of an overhaul.
-- Added a 3rd pattern file, patterns_iraw.txt. It is exactly the same as patterns_raw.txt except that it does a case-insensitive search. Useful for finding phishers using words like phishtank, etc.
-- Added pattern comment and whitespace handling. We can now add comments and blank lines to the pattern files to make them easier to document and format. I started sorting the patterns, but I don't know what many of them are picking up.
-- Changed the output format. The md5 checksum now comes before the path, as that column is fixed width... it might not be as pretty if you use a narrow terminal window. Also added color to the md5 and to the matched pattern to make it easier to pick out the individual parts. Added '#' characters to the output to make it bash safe in the event of accidental paste dumps into the terminal, since '#' is the comment symbol for bash. I'm still not completely happy with the output.
-- Created more flag options so there is a short and a long form of each flag, e.g. --directory or -d, --ignore or -i, --hide-ok or -k.
-- Added a verbose option which will continue scanning the same file even after a first match is found... useful for finding which patterns are the most common, or for gathering additional data points to classify a suspicious file.
-- Added a Python script called b64p.py which takes an input string and returns the 3 base64 string equivalents. Usage: ./b64p.py 'preg_match'
-- Added php_keywords.txt. This is a file containing a list of PHP 7 keywords and their base64 translations. It is formatted in such a way as to be able to be used as a pattern file as a replacement for patterns_raw.txt. Useful to scan known malware to find new patterns to look for.
-- Added php_functions.txt. Similar to php_keywords.txt, but specific to PHP 7 functions. Very large and very slow to scan with, but it's a good reference file and useful for scanning known malware.
-- Sorted the whitelist file, first by path, then by md5 checksum.
-- Modified the part of the code responsible for reading in pattern files. Each pattern file is now read once, instead of being re-read every time a new .php file is scanned. Should boost performance marginally.
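
In case it helps with review, here is a rough sketch of the idea behind the three base64 equivalents. It is not the actual b64p.py. It assumes the three strings correspond to the three byte alignments (offset 0, 1, or 2) a keyword can have inside base64-encoded data, and that only the characters fully determined by the keyword are kept. The name base64_variants is made up for illustration.

```python
import base64
from typing import List


def base64_variants(keyword: str) -> List[str]:
    """Return the base64 fragments a keyword can produce at byte offsets 0, 1 and 2.

    Hypothetical helper: base64 maps every 3 input bytes to 4 output characters,
    so the encoding of a keyword depends on its position modulo 3.
    """
    data = keyword.encode("utf-8")
    variants = []
    for offset in range(3):
        # Pad with dummy bytes to shift the keyword to this alignment.
        encoded = base64.b64encode(b"\x00" * offset + data).decode("ascii")
        # Each output character covers 6 bits. Keep only the characters whose
        # bits lie entirely inside the keyword, so neighbouring bytes in the
        # real file cannot change them.
        start = -(-8 * offset // 6)           # ceil(8 * offset / 6)
        end = 8 * (offset + len(data)) // 6   # floor(8 * (offset + len) / 6)
        variants.append(encoded[start:end])
    return variants


if __name__ == "__main__":
    for fragment in base64_variants("preg_match"):
        print(fragment)
```

Each printed fragment could then go on its own line in a raw pattern file, so an encoded payload matches regardless of where the keyword falls in the original text.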
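
The pattern file changes above (comments and blank lines, the case-insensitive file, verbose mode, and loading patterns once) fit together roughly as sketched below. This is an illustration under assumptions, not the code in this PR: it assumes '#' starts a comment line and that raw patterns are literal strings rather than regexes. The names load_patterns and scan are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

CompiledPattern = Tuple[str, "re.Pattern"]


def load_patterns(lines: Iterable[str], ignore_case: bool = False) -> List[CompiledPattern]:
    """Compile literal patterns once, skipping blank lines and '#' comments (assumed syntax)."""
    flags = re.IGNORECASE if ignore_case else 0
    patterns = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # blank line or comment
        # re.escape treats the pattern as a literal string (an assumption about raw patterns).
        patterns.append((text, re.compile(re.escape(text), flags)))
    return patterns


def scan(content: str, patterns: List[CompiledPattern], verbose: bool = False) -> List[str]:
    """Return the first matching pattern, or every matching pattern when verbose is set."""
    hits = []
    for text, compiled in patterns:
        if compiled.search(content):
            hits.append(text)
            if not verbose:
                break  # default behaviour: stop at the first match
    return hits


if __name__ == "__main__":
    sample_file = ["# phishing keywords", "", "phishtank", "eval("]
    # Loaded a single time, then reused for every file that gets scanned.
    iraw = load_patterns(sample_file, ignore_case=True)
    print(scan("See PhishTank; EVAL($x)", iraw, verbose=True))
```

Compiling the patterns up front is what lets the same list be reused across files instead of re-reading the pattern files for each one.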