scr34m / php-malware-scanner

Scans PHP files for malwares and known threats
GNU General Public License v3.0

Lots of tweaks #3

Closed nichogenius closed 7 years ago

nichogenius commented 7 years ago

-- Added a 3rd pattern file, patterns_iraw.txt. Exactly the same as patterns_raw.txt except that it does a case-insensitive search. Useful for finding phishers using words like phishtank, etc.
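To illustrate the difference between the raw and the case-insensitive raw lookup, here is a minimal Python sketch. It is not the scanner's PHP code; the function name `matches_raw` and its `ignore_case` parameter are made up for this example.

```python
def matches_raw(content, pattern, ignore_case=False):
    """Plain substring search, optionally ignoring case.

    Hypothetical helper: stands in for how a patterns_raw.txt entry
    (ignore_case=False) and a patterns_iraw.txt entry (ignore_case=True)
    might be checked against a file's content.
    """
    if ignore_case:
        return pattern.lower() in content.lower()
    return pattern in content
```

With this, `matches_raw("Report to PhishTank", "phishtank", ignore_case=True)` is true, while the same call without `ignore_case` is false.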

-- Added pattern comment and whitespace handling. We can now add comments and blank lines to the pattern files to make them easier to document and format. Started sorting the patterns, but I don't know what many of them are picking up.
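A rough Python sketch of this kind of comment and blank-line handling follows. The `#` comment marker and the name `load_patterns` are assumptions made for the example; the actual marker and the PHP implementation in this PR may differ.

```python
def load_patterns(lines, comment_prefix="#"):
    """Return the patterns from an iterable of lines.

    Blank lines and lines starting with comment_prefix (assumed to be
    '#') are skipped. Only the trailing newline is removed from a kept
    line, so leading or trailing spaces inside a pattern survive.
    """
    patterns = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(comment_prefix):
            continue  # blank line or comment: not a pattern
        patterns.append(line.rstrip("\r\n"))
    return patterns
```

One consequence of this design is that a pattern which itself begins with the comment marker cannot be expressed without some escaping rule, which the sketch does not attempt.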

-- Changed the output format. The md5 checksum now comes before the path, as that column is fixed width. It might not be as pretty if you use a narrow terminal window. Also added color to the md5 and the matched pattern to make it easier to pick out the individual parts. Added '#' characters to the output to make it bash-safe in the event of accidental paste dumps into the terminal, as '#' is the comment symbol for bash. I'm still not completely happy with the output.

-- Created more flag options so there is a short and a long form of each flag, i.e. --directory or -d, --ignore or -i, --hide-ok or -k.

-- Added a verbose option, which continues scanning the same file even after a first match is found. Useful for finding which patterns are the most common, or for collecting additional data points to classify a suspicious file.
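The difference between the default and the verbose behaviour can be sketched in Python as below. This is an illustration only, assuming plain substring patterns; `match_patterns` is a hypothetical name, not a function from the scanner.

```python
def match_patterns(content, patterns, verbose=False):
    """Return the patterns found in content.

    By default the search stops at the first hit; with verbose=True it
    keeps going and reports every pattern that matches.
    """
    matches = []
    for pattern in patterns:
        if pattern in content:
            matches.append(pattern)
            if not verbose:
                break  # first match is enough to flag the file
    return matches
```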

-- Added a python script called b64p.py which takes an input string and returns the 3 base64 string equivalents. Usage: ./b64p.py 'preg_match'
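For context on why there are three equivalents: base64 encodes 3 bytes as 4 characters, so the same text produces different characters depending on whether it starts at byte offset 0, 1 or 2 (mod 3) of the encoded data. The sketch below shows one way to compute such variants; it is not the source of b64p.py, and the name `base64_variants` and the trimming strategy are assumptions.

```python
import base64


def base64_variants(text):
    """Return three base64 fragments, one per byte alignment.

    For each offset (0, 1, 2) the text is prefixed with that many filler
    bytes and encoded. Characters that also depend on the filler bytes,
    or on whatever bytes would follow the text, are trimmed so that only
    the stable part remains.
    """
    raw = text.encode("utf-8")
    variants = []
    for offset in range(3):
        encoded = base64.b64encode(b"\x00" * offset + raw).decode("ascii")
        # Leading characters that mix in the filler bytes.
        head = (0, 2, 3)[offset]
        # Trailing characters that hold padding or depend on later bytes.
        tail = (0, 3, 2)[(offset + len(raw)) % 3]
        variants.append(encoded[head:len(encoded) - tail])
    return variants
```

Each returned fragment should then appear verbatim in the base64 encoding of any data that contains the text at the matching alignment. For very short inputs a fragment can come out empty.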

-- Added php_keywords.txt. This file contains a list of PHP 7 keywords and their base64 translations. It is formatted so that it can be used as a pattern file in place of patterns_raw.txt. Useful for scanning known malware to find new patterns to look for.

-- Added php_functions.txt. Similar to php_keywords.txt, but specific to PHP 7 functions. Very large and very slow to scan with, but it's a good reference file and useful to scan known malware with.

-- Sorted the whitelist file first by path, then by md5 checksum.

-- Modified the part of the code responsible for reading in pattern files. Each pattern file is now read once, instead of being read again every time a new .php file is scanned. Should boost performance marginally.
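The read-once idea can be sketched in Python as follows. This is an illustration, not the scanner's PHP code; `scan_files` and its `load_patterns` callable are hypothetical, and patterns are treated as plain substrings for brevity.

```python
def scan_files(files, load_patterns):
    """Scan several files against one shared pattern list.

    files: mapping of path -> file content.
    load_patterns: callable returning the list of patterns; it is called
    exactly once, before the loop, rather than once per scanned file.
    """
    patterns = load_patterns()
    hits = {}
    for path, content in files.items():
        hits[path] = [p for p in patterns if p in content]
    return hits
```

The gain comes from hoisting the load out of the per-file loop, so its size depends on how many files are scanned and how large the pattern files are.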

scr34m commented 7 years ago

Oh god :) lots of changes. I will check them on the weekend just to be sure, but I don't see any problem with accepting them. Thanks!

nichogenius commented 7 years ago

As for downsides, I think I only dropped about 3 patterns from the original that were causing too many false positives for my liking: Spammer, (chr(\d+).){4}, and one other I can't think of right now.

Most of the changes were adding functionality or just tidying some stuff up.

nichogenius commented 7 years ago

It's probably best if you try out the changes before accepting. The output formatting might be a bit too much of an overhaul.