prosyslab-classroom / cs348-information-security

[HackGPT] Reverse-engineering a webpacked page #365

Open TriangleYJ opened 1 year ago

TriangleYJ commented 1 year ago

Name: 주예준

Description (up to 10 sentences)

https://chat.openai.com/share/14941dfc-f648-4521-84af-a2b13e457526

I gave ChatGPT the code that iam2.kaist.ac.kr uses for encryption and decryption. This code is very hard for a person to read because it has been bundled by webpack. Nevertheless, ChatGPT finished reverse-engineering it in a short time.

There are some points to note.

  1. ChatGPT inferred from the code that the encryption and decryption routines use the CryptoJS library, even though the code I first gave it never contains the text "CryptoJS". This is clearly something a Google search cannot do.
  2. ChatGPT refactored the webpacked code into human-readable code. Doing this by hand takes a long time, but ChatGPT produced exploitable code (requests can be constructed with fetch) in a short time.
  3. ChatGPT correctly explained how the IV and public key are constructed.
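
To make point 2 concrete, here is a toy sketch of what turning bundled output back into readable code amounts to. It is not the actual iam2.kaist.ac.kr code: the XOR logic and both function names are hypothetical, chosen only to show that the two versions behave identically and differ in readability.

```typescript
// Hypothetical minified output, similar in shape to what a bundler's minifier emits.
const t=(e:string,n:number):string=>e.split("").map(r=>String.fromCharCode(r.charCodeAt(0)^n)).join("");

// The same logic after renaming and reformatting.
// Nothing was "decrypted": only names and layout changed.
function xorWithKey(text: string, key: number): string {
  return text
    .split("")
    .map((ch) => String.fromCharCode(ch.charCodeAt(0) ^ key))
    .join("");
}

// Both versions produce the same output for the same input.
console.log(t("abc", 5) === xorWithKey("abc", 5));
```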

tanapthetimid commented 1 year ago

I don't think webpack was designed to be an obfuscation tool. There are obfuscation plugins for webpack, but I don't think they were used here. CryptoJS is also one of the most widely used encryption libraries in JS, so ChatGPT guessing CryptoJS doesn't seem very significant to me. I think the core issue here is that there are better ways to build a website than storing secrets in the JS file. It may be more interesting to see whether ChatGPT can reverse-engineer more heavily obfuscated software, such as DRM and anti-cheat systems in video games and other applications.

TriangleYJ commented 1 year ago

I agree with your point, but I had never used the CryptoJS library at all, so the guess struck me as quite interesting. (I only just found out that it is one of the most widely used libraries...) As a simple test, I tried the output of a famous and funny JS obfuscation (here), but ChatGPT said that it is "difficult to determine its purpose or functionality without further context or explanation". I think reverse-engineering software protected by well-made DRM would give the same result, so I also don't think this is a fundamental threat. However, even someone who is not an expert, like me, can easily try various kinds of reverse engineering on simple programs using ChatGPT. I think ChatGPT could be a threat in that it has lowered the barrier to reverse engineering.

KihongHeo commented 1 year ago

Hi. I am happy to see the discussions between the top hackers in the class.

Let me clarify one thing. Even though the example involves CryptoJS, it seems that ChatGPT does "deobfuscation", not "decryption". Am I correct?

tanapthetimid commented 1 year ago

@KihongHeo I believe so. ChatGPT was clarifying that the encryption functions may have come from CryptoJS. @TriangleYJ That's very interesting. Then I wonder if we could fine-tune LLMs further to deobfuscate even more complex stuff. Maybe one day we'll have a PS4 emulator and I'll be able to play Bloodborne on PC :\

TriangleYJ commented 1 year ago

Yes, it does deobfuscation, not decryption. The purpose of this deobfuscation is to be able to send arbitrary malicious requests to the IAM server. To send a request, every payload has to be encrypted on the client side, so I needed to figure out how the client encrypts the data.
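
As an illustration of why client-side encryption does not prevent this, here is a minimal sketch using Node's built-in `crypto` module rather than CryptoJS. The actual algorithm, key, and IV used by the IAM site are not shown in this thread, so AES-256-CBC and every constant below are assumptions made up for the example.

```typescript
import { createCipheriv, createDecipheriv } from "node:crypto";

// Hypothetical values: a real bundle would embed or derive its own key and IV.
const KEY = Buffer.from("0123456789abcdef0123456789abcdef", "utf8"); // 32 bytes for AES-256
const IV = Buffer.from("abcdef9876543210", "utf8"); // 16 bytes, one AES block

// Encrypt a request payload the way a client-side script might before sending it.
function encryptPayload(plaintext: string): string {
  const cipher = createCipheriv("aes-256-cbc", KEY, IV);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return encrypted.toString("base64");
}

// Anyone holding the same key and IV can reverse the operation.
function decryptPayload(ciphertextB64: string): string {
  const decipher = createDecipheriv("aes-256-cbc", KEY, IV);
  const decrypted = Buffer.concat([
    decipher.update(Buffer.from(ciphertextB64, "base64")),
    decipher.final(),
  ]);
  return decrypted.toString("utf8");
}

const body = encryptPayload(JSON.stringify({ user: "example" }));
console.log(decryptPayload(body));
```

Because the key and IV ship with the page, anyone who reads them out of the bundle can build payloads that look exactly like the ones the real client sends. Under these assumptions the encryption hides the traffic format but does not authenticate the sender.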

@TanapTheTimid I think it may be possible with a custom LLM trained on all the source code on the internet together with the corresponding binary (obfuscated) files. GPT-4 or a future version of GPT may be able to do this :)