Closed nichogenius closed 7 years ago
I had already considered using "replace" instead of "preg_replace", but the samples mostly used preg_replace, so I didn't take that step. We will see now whether this approach is better or not :)
Thanks for the work
It looks like you have several strings of base64 text as samples. The problem with doing that is that the same input string, run through a base64 encoder, can give 3 different outputs depending on the offset of the first character of your target match within the data being encoded.
To illustrate, try the following:
base64_encode("base64_decode");    // --> YmFzZTY0X2RlY29kZQ==      // 'b' at index 0
base64_encode(" base64_decode");   // --> IGJhc2U2NF9kZWNvZGU=      // 'b' at index 1
base64_encode("( base64_decode");  // --> KCBiYXNlNjRfZGVjb2Rl      // 'b' at index 2
base64_encode("l( base64_decode"); // --> bCggYmFzZTY0X2RlY29kZQ==  // 'b' at index 3 - repeat of 0
// Notice how the 1st and 4th lines share the same tail (YmFzZTY0X2RlY29kZQ==), as we would expect, but the 2nd and 3rd lines are different from the others.
Even though the plain text is very similar, the base64 output is drastically different. This is due to the 8-bit to 6-bit conversion: base64 packs every three 8-bit bytes into four 6-bit characters, so the output depends on the position of the target string modulo 3. Any ASCII string can therefore show up as 3 distinct base64 strings. Accounting for all three should greatly enhance the ability to detect obfuscated PHP code with base64 samples.
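To make the regrouping visible, here is a minimal sketch that splits a string's bits into the 6-bit groups base64 works with. The helper name `six_bit_groups` is hypothetical (it is not part of this project), and it assumes an input whose length is a multiple of 3 bytes so that no padding is involved.

```php
<?php
// Hypothetical helper: return the bits of $bytes regrouped into 6-bit chunks,
// which is the regrouping base64 performs before mapping chunks to characters.
// Assumes strlen($bytes) is a non-zero multiple of 3 (no padding case).
function six_bit_groups(string $bytes): array
{
    $bits = '';
    foreach (str_split($bytes) as $byte) {
        // Each input byte contributes exactly 8 bits.
        $bits .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }
    return str_split($bits, 6);
}

// 'b' at index 0: its 8 bits span the 1st group and part of the 2nd.
echo implode(' ', six_bit_groups("bas")), "\n"; // 011000 100110 000101 110011  (YmFz)

// 'b' at index 1: the same 8 bits now span the 2nd and 3rd groups.
echo implode(' ', six_bit_groups(" ba")), "\n"; // 001000 000110 001001 100001  (IGJh)
```

Shifting the text by one byte moves every byte boundary relative to the 6-bit boundaries, which is why no output character is shared between the two cases.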
My updates replicate the existing base64 examples at the different offsets. I also trimmed the existing samples down to the maximum safe sample, since the characters at the edges can be changed by the preceding and following characters.
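As a rough sketch of how such samples could be generated (not necessarily how the samples in this update were produced), the function below encodes a target string at offsets 0, 1 and 2 and keeps only the base64 characters that are fully determined by the target itself. The name `base64_safe_variants` and the choice of NUL bytes as filler are assumptions made for illustration.

```php
<?php
// Hypothetical helper: for each byte offset (0, 1, 2), return the longest
// base64 substring whose characters depend only on $needle, not on neighbours.
function base64_safe_variants(string $needle): array
{
    $length = strlen($needle);
    if ($length < 2) {
        throw new InvalidArgumentException('needle must be at least 2 bytes');
    }

    $variants = [];
    for ($offset = 0; $offset < 3; $offset++) {
        // Filler bytes only shift the alignment; their value is irrelevant
        // because every character they influence is trimmed below.
        $encoded = rtrim(base64_encode(str_repeat("\0", $offset) . $needle), '=');

        // Base64 character i covers bits [6i, 6i + 6); the needle covers
        // bits [8 * offset, 8 * (offset + length)).
        $start = intdiv(8 * $offset + 5, 6);        // ceil(8 * offset / 6)
        $end   = intdiv(8 * ($offset + $length), 6); // floor, exclusive

        $variants[$offset] = substr($encoded, $start, $end - $start);
    }
    return $variants;
}

print_r(base64_safe_variants("base64_decode"));
// [0] => YmFzZTY0X2RlY29kZ
// [1] => Jhc2U2NF9kZWNvZG
// [2] => iYXNlNjRfZGVjb2Rl
```

Each variant is a substring of the corresponding encoded line in the earlier example, with the edge characters removed. Note that a plain substring search for these variants does not check where in the base64 text the match starts, so it may occasionally match at a misaligned position.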