scr34m / php-malware-scanner

Scans PHP files for malwares and known threats
GNU General Public License v3.0
556 stars 96 forks

Added some base64 samples and an entry to the whitelist #2

Closed nichogenius closed 7 years ago

nichogenius commented 7 years ago

It looks like you have several strings of base64 text as samples. The problem with doing that is that a given target string can show up in base64 output in 3 different forms, depending on the byte offset (modulo 3) at which its first character sat in the input when it was encoded.

To illustrate, try the following:

    base64_encode("base64_decode");    // --> YmFzZTY0X2RlY29kZQ==      // 'b' at index 0
    base64_encode(" base64_decode");   // --> IGJhc2U2NF9kZWNvZGU=      // 'b' at index 1
    base64_encode("( base64_decode");  // --> KCBiYXNlNjRfZGVjb2Rl      // 'b' at index 2
    base64_encode("l( base64_decode"); // --> bCggYmFzZTY0X2RlY29kZQ==  // 'b' at index 3 - repeat of 0

    // Notice how the 1st and 4th lines are mostly the same, as we would expect,
    // but the 2nd and 3rd lines are different from the others.

Even though the plain text is very similar, the base64 output is drastically different. This is because base64 repacks 8-bit bytes into 6-bit characters, three bytes to four characters at a time, so the output depends on where the string falls within a 3-byte group. Any ASCII string can therefore appear as 3 distinct base64 strings, one per offset. Accounting for all three should greatly enhance the ability to detect obfuscated PHP code with base64 samples.
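As a rough sketch of the point above, the three forms can be generated by encoding the target with 0, 1, or 2 filler bytes in front of it. The helper name `base64_alignments` is made up for this illustration and is not part of the scanner; the filler byte (a space) is arbitrary.

```php
<?php
// Hypothetical helper (not part of php-malware-scanner): encode $needle as it
// would look when it starts at byte offset 0, 1, or 2 of a 3-byte group.
function base64_alignments(string $needle): array
{
    $variants = [];
    foreach ([0, 1, 2] as $offset) {
        // Prepend arbitrary filler so the needle starts at $offset.
        $variants[$offset] = base64_encode(str_repeat(' ', $offset) . $needle);
    }
    return $variants;
}

foreach (base64_alignments('base64_decode') as $offset => $encoded) {
    echo $offset, ': ', $encoded, PHP_EOL;
}
```

Offsets 0 and 1 reproduce the first two outputs in the example above. The leading characters of the offset 1 and 2 variants still depend on the filler bytes, so the full strings are not usable as samples without trimming.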

My updates replicate the existing base64 samples at the other offsets. I also trimmed the existing samples down to the longest safe substring, since the characters at either edge of an encoded sample can be changed by the bytes that precede and follow the target.
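One way to derive such trimmed samples, sketched under my own assumptions rather than taken from the pull request: for each offset, keep only the base64 characters whose 6 bits come entirely from the target string. The function names `base64_safe_cores` and `contains_encoded` are hypothetical, and the sketch assumes the encoded text has no line breaks inserted into it.

```php
<?php
// Hypothetical sketch: for each of the three alignments, return the longest
// run of base64 characters that depends only on the bytes of $needle.
function base64_safe_cores(string $needle): array
{
    $cores = [];
    $bits = 8 * strlen($needle);
    foreach ([0, 1, 2] as $offset) {
        // Filler bytes only shift the alignment; their value is irrelevant
        // because every character they influence is trimmed off below.
        $encoded = base64_encode(str_repeat("\0", $offset) . $needle);
        // First character that lies fully inside the needle: ceil(8*offset/6).
        $start = intdiv(8 * $offset + 5, 6);
        // One past the last character fully inside it: floor(8*(offset+len)/6).
        $end = intdiv(8 * $offset + $bits, 6);
        $cores[$offset] = substr($encoded, $start, $end - $start);
    }
    return $cores;
}

// True if any trimmed variant of $needle occurs in the base64 text $haystack.
function contains_encoded(string $haystack, string $needle): bool
{
    foreach (base64_safe_cores($needle) as $core) {
        if ($core !== '' && strpos($haystack, $core) !== false) {
            return true;
        }
    }
    return false;
}

// 'eval(' is 5 bytes long, so the target starts at offset 2 here.
$payload = base64_encode('eval(base64_decode($x));');
var_dump(contains_encoded($payload, 'base64_decode'));
```

Trimming costs one or two characters per edge, so very short targets leave little to match on and are more prone to false positives.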

scr34m commented 7 years ago

I already had an idea about using "replace" instead of "preg_replace", but the samples mostly used preg_replace, so I didn't take that step. We will see now whether this approach is better or not :)

Thanks for the work