Open gab12 opened 8 months ago
Can you please add some more detail on this? Which "well-known AI" was it, and what code did you give it to de-obfuscate? I mean, maybe it can resolve small, simple pieces of code, but does it also manage to de-obfuscate complex programs with hundreds or even thousands of lines?
And how did it know that, say, your obfuscated constant kvVIgcZOKDckqVxb was, say, LABEL_ORDERS? That's impossible, I dare say.
It may very well be possible, if you forgot to remove the yakpro-po comments. I always have a line like this in my workflow, at the end of the obfuscation process:
find obfuscated-code-dir/ -type f | grep -E '\.(php|css)$' | xargs -P 5 -n 10 sed -i -f sed-script-yakpro-po-remove-comments
This finds all PHP/CSS files in the obfuscated-code-dir directory and runs sed on them in a parallel way (using xargs -P) with the sed script sed-script-yakpro-po-remove-comments. I attach the sed script for your convenience: sed-script-yakpro-po-remove-comments.txt
# NOTE: For this sed script to work, you must
# change config.php of yakpro-po to delimit its comments with
#
# /* |--------------------------------------------------|
#
# and
#
# |--------------------------------------------------| */
#
# instead of the original
#
# /* __________________________________________________
#
# and
#
# */
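For readers without the attachment, the comment-stripping step can be sketched as a single sed range delete. This is a hypothetical stand-in, not the attached script: the function name strip_yakpro_comments is invented here, and the sketch assumes that each of the custom delimiters from the NOTE above sits on a line of its own.

```shell
# Hypothetical stand-in for the attached sed script (its real contents are not
# shown in this thread). Assumes yakpro-po was configured to emit the custom
# delimiters from the NOTE, each on a line of its own.
strip_yakpro_comments() {
    # Delete every line from the opening "/* |-----...|" delimiter through the
    # closing "|-----...| */" delimiter, inclusive. Reads the files given as
    # arguments, or stdin when there are none, and writes to stdout.
    sed '/\/\* |-\{10,\}|/,/|-\{10,\}| \*\//d' "$@"
}

# Example: a header comment wrapped in the custom delimiters is removed,
# leaving only the "<?php" and "echo 1;" lines.
printf '%s\n' \
    '<?php' \
    '/* |--------------------------------------------------|' \
    '   |  sample header text                              |' \
    '   |--------------------------------------------------| */' \
    'echo 1;' | strip_yakpro_comments
```

The same expression could presumably be saved to a file and passed to sed -i -f, as in the pipeline above, to edit the files in place.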
If you don't do the above and leave yakpro-po comments in your obfuscated code, it is a matter of one command to de-obfuscate some name, with the help of the --whatis option:
yakpro-po --config-file path-to/yakpro-po.cnf path-to/original-dir -o obfuscated-code-dir --whatis kvVIgcZOKDckqVxb
Of course, here you need the original, unobfuscated directory, but maybe someone wrote a script to bypass this and de-obfuscate using only the yakpro-po comments, and that is what the AI tool might have used.
I am curious if your AI tool can de-obfuscate a program with comments removed as above.
For information, the AI I use is ChatGPT 3.5.
Just ask it to unobfuscate the code and it does so. Obfuscation can discard the names of the original variables, so you need to extend the request by specifying that the variables should also be made readable.
It will then understand the code and restore (or rewrite) the variables in a very clear and readable way. I've run several tests and the results are impressive!
Other people I have spoken to, who tried it with other obfuscation tools, report impeccable results. Personally, I use yakpro's default configuration; I haven't tried deleting the comments, if it doesn't do that by default. But I don't think that's a problem. I invite you to test it on your code :)
Note: I'm not comfortable explaining the de-obfuscation method in detail. I'll edit the post in a few days to remove the explanation. The idea is to make people aware that obfuscation is no longer a way of securing one's code and can be broken in a few seconds by anyone, at least in my opinion... I think that low-level obfuscation tools like Zend Guard should still resist, but I've never tested it.
low-level obfuscation tools like Zend Guard should still resist, but I've never tested it.
Well, no, not really. First of all, Zend Guard encrypts the code. The problem is that PHP somehow has to decrypt the code to run it. Now, if your PHP engine can decrypt something, then you can too! All you need is to re-compile PHP, changing the source code at some point to print what it has decrypted just before it executes it. That is why you will find services that will decrypt your code for a few $$.
Therefore, if we have a chance to protect PHP source code, obfuscation is the way to go, not encryption.
It will then understand the code and restore (or rewrite) the variables in a very clear and readable way.
Again, you are not being clear enough. To stay with my example above, does it change kvVIgcZOKDckqVxb to your original name, say, LABEL_ORDERS? Or does it change it to something simpler, say L? It is easy to find all convoluted names in some code and change them to something simpler, that is: change kvVIgcZOKDckqVxb to a one-letter constant L. Then you still have to understand the role of constant L in the code... unless ChatGPT "understands" that L is a constant that has to do with orders and uses a "meaningful" name, e.g. L_ORDERS or something.
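To illustrate how mechanical the "change them to something simpler" step is, here is a minimal sketch in awk. It rests on an assumption drawn only from the examples in this thread, namely that scrambled names are exactly 16 characters long; that is an observation, not a documented yakpro-po guarantee, and any legitimate 16-character identifier would be renamed too. The function name simplify_names and the id prefix are invented for this example.

```shell
# Hypothetical sketch: replace every 16-character identifier with a short
# placeholder (id1, id2, ...), reusing the same placeholder for repeated names.
# The 16-character rule is an assumption taken from the names in this thread.
simplify_names() {
    awk '
    {
        line = $0; out = ""
        # Walk the line one identifier-like token at a time.
        while (match(line, /[A-Za-z_][A-Za-z0-9_]*/)) {
            tok = substr(line, RSTART, RLENGTH)
            if (RLENGTH == 16) {
                # First sighting of this name: assign the next placeholder.
                if (!(tok in short)) short[tok] = "id" (++n)
                tok = short[tok]
            }
            out = out substr(line, 1, RSTART - 1) tok
            line = substr(line, RSTART + RLENGTH)
        }
        print out line
    }' "$@"
}

# Example using names from this thread.
printf '%s\n' \
    'if (!(qbq1_k0qvpRaKW3_)) {' \
    'goto Ai0911ru0LZDUiLb;' \
    '}' \
    'Ai0911ru0LZDUiLb:' | simplify_names
# prints:
# if (!(id1)) {
# goto id2;
# }
# id2:
```

The output is shorter but no more meaningful: id1 says nothing about orders, which is exactly the gap between renaming and recovering the original name.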
Plus: yakpro-po can change the order of execution. I use it and the obfuscated code is full of gotos! Example:
if (!(qbq1_k0qvpRaKW3_ && isset($_SESSION["\x44\x45\x42\125\107"]) && $_SESSION["\104\x45\102\125\x47"])) {
goto Ai0911ru0LZDUiLb;
}
L0AqwMuafHCO0g1G::TzvDpfbljJrnOMwE($QY71R44X2_TIBqJA);
Ai0911ru0LZDUiLb:
if (!(T_huakxp5uXdH4Ab && isset($_SESSION["\x44\105\102\125\107"]) && $_SESSION["\x44\x45\x42\125\x47"] && $dyILjT9pOSH28I62 != array())) {
goto TBSd5sIHxW5iIVlb;
}
Works exactly as the original, but good luck understanding the logic! :-)
P.S. Feel free to post the unobfuscated version by ChatGPT of the above - it is not anything special. Just curious. I am too busy to do it myself right now...
To be very honest:
What can never be reverted, because the information is lost:
It seems enough for me to say that obfuscation is the best way to protect code... For big projects with hundreds of source code files, it will make it nearly impossible for someone to understand the code.
Any chance this will be updated so it can be used with modern PHP scripts? PHP 8 is here to stay, unfortunately.
The new GPT-4o model (maybe it would also work with GPT-3) works AWFULLY WELL at resolving gotos, at least in small code snippets. https://chatgpt.com/share/2e891b5c-6ad0-46a8-984b-4bbb414b7a17 (From a file from a learning platform developed for a friend of mine whose name is Emmanuel). It's... yeah. The AI, a machine, makes humans understand stuff that should only be understandable by machines.
Hello,
I've been using yak for years and love the work that's been done so far. Well done. However, IT is evolving and so are the tools. The AI revolution is now here.
I did a test with a well-known AI, and passed some code obfuscated by yak to the AI. It gave me back the original code in a matter of seconds.
I think that with AI, high-level obfuscation is unfortunately obsolete, because I can't see any system that can defeat AI.
What do you think?