lelinhtinh / de4js

JavaScript Deobfuscator and Unpacker
https://lelinhtinh.github.io/de4js/
MIT License
1.29k stars, 325 forks

Figure out variable names #53

Closed: jimmywarting closed this issue 3 years ago

jimmywarting commented 3 years ago

Short variable names aren't nice. JS Nice picks from a set of known variable names, but the name it picks often isn't the right word.

It would be cool if it could be context-aware and rename single-character variables to the matching property name when they are used in an object.

Example input:

function hello(m) {
  var a = 'alert'
  return {
    action: a,
    message: m 
  }
}

Output:

function hello (message) {
  var action = 'alert'
  return {
    action: action,
    message: message
  }
}
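
As a rough sketch of the idea in the first example: if the returned object is available as an AST node, the property keys can be read off as rename suggestions. The node shapes below are hand-written in an ESTree-like style rather than produced by a real parser, and suggestRenames is a hypothetical helper, not something de4js provides.

```javascript
// Hand-written, ESTree-like node for: { action: a, message: m }
const returned = {
  type: 'ObjectExpression',
  properties: [
    {
      type: 'Property',
      key: { type: 'Identifier', name: 'action' },
      value: { type: 'Identifier', name: 'a' },
    },
    {
      type: 'Property',
      key: { type: 'Identifier', name: 'message' },
      value: { type: 'Identifier', name: 'm' },
    },
  ],
};

// Hypothetical helper: for every single-character identifier used as a
// property value, suggest the property key as its new name.
function suggestRenames(objectExpression) {
  const renames = new Map();
  for (const { key, value } of objectExpression.properties) {
    const isShortIdentifier =
      value.type === 'Identifier' && value.name.length === 1;
    if (key.type === 'Identifier' && isShortIdentifier) {
      renames.set(value.name, key.name);
    }
  }
  return renames;
}

const renames = suggestRenames(returned);
// renames now maps a -> action and m -> message
```

This only produces suggestions. Applying them safely still needs a check that the new name is free in the enclosing scope.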

Another example that expands on this by using function argument names (based on the above output):

function hello(m) {
  var a = 'alert'
  return {
    action: a,
    message: m 
  }
}

function init() {
  var m = 'world'
  hello(m)
}

Output:

function hello (message) {
  var action = 'alert'
  return {
    action: action,
    message: message
  }
}

function init() {
  var message = 'world'
  hello(message)
}

The AST knows that hello's first argument is named message, so it can use the same name for the variable passed in at the call site.
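
A minimal sketch of that call-site step, assuming the parameter names of hello have already been recovered. The knownParams table and suggestFromCall are hypothetical names used only for illustration, and the call node is hand-written in an ESTree-like shape.

```javascript
// Assumed input: parameter names already recovered per function.
const knownParams = new Map([['hello', ['message']]]);

// Hand-written, ESTree-like node for the call: hello(m)
const call = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'hello' },
  arguments: [{ type: 'Identifier', name: 'm' }],
};

// Hypothetical helper: match each short argument identifier with the
// parameter name at the same position in the called function.
function suggestFromCall(callExpression, params) {
  const renames = new Map();
  const names = params.get(callExpression.callee.name) || [];
  callExpression.arguments.forEach((arg, index) => {
    const paramName = names[index];
    if (arg.type === 'Identifier' && arg.name.length === 1 && paramName) {
      renames.set(arg.name, paramName);
    }
  });
  return renames;
}

const callRenames = suggestFromCall(call, knownParams);
// callRenames maps m -> message for the variable inside init()
```

Calls to unknown functions, or arguments that are not plain identifiers, simply yield no suggestion here.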

lelinhtinh commented 3 years ago

After finding the property name, you need to check whether the new variable name is already in use, handle duplicates, and so on. What do you do with data like this:

let m = 'hello';
// something
m += ' world!';

const a = 'success', b = 'error';

const alertStore = [
  {
    type: a,
    message: m,
  },
  {
    type: b,
    message: m,
  },
  {
    type: b,
    message: m,
  },
  {
    type: b,
    message: m,
  },
  {
    type: a,
    message: m,
  },
];

It is a complicated job that should be a separate project. I'm afraid my knowledge is not enough. By the way, this is the output from UglifyJS 3:

function hello(n) {
  return {
    action: 'alert',
    message: n,
  };
}

function init() {
  hello('world');
}
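
For the alertStore data above, one conservative option (a sketch, not how de4js or UglifyJS behaves) is to rename only when the mapping is unambiguous: the variable is used under exactly one property name, no other variable wants that name, and the name is not already declared. The resolveRenames helper and its input format are assumptions made for this illustration.

```javascript
// Hypothetical helper.
// pairs: [variable, property] for every `property: variable` seen.
// declaredNames: names already declared in the same scope.
function resolveRenames(pairs, declaredNames) {
  const candidates = new Map(); // variable -> Set of property names
  for (const [variable, property] of pairs) {
    if (!candidates.has(variable)) candidates.set(variable, new Set());
    candidates.get(variable).add(property);
  }

  const claims = new Map(); // property -> number of variables wanting it
  for (const properties of candidates.values()) {
    if (properties.size !== 1) continue;
    const [property] = properties;
    claims.set(property, (claims.get(property) || 0) + 1);
  }

  const renames = {};
  for (const [variable, properties] of candidates) {
    if (properties.size !== 1) continue; // used under several names
    const [property] = properties;
    if (claims.get(property) > 1) continue; // two variables want one name
    if (declaredNames.has(property)) continue; // name already taken
    renames[variable] = property;
  }
  return renames;
}

// The pairs found in alertStore, in order.
const pairs = [
  ['a', 'type'], ['m', 'message'],
  ['b', 'type'], ['m', 'message'],
  ['b', 'type'], ['m', 'message'],
  ['b', 'type'], ['m', 'message'],
  ['a', 'type'], ['m', 'message'],
];
const safeRenames = resolveRenames(pairs, new Set(['m', 'a', 'b', 'alertStore']));
// safeRenames is { m: 'message' }; a and b both want `type`, so they stay.
```

The `m += ' world!'` reassignment does not matter for this check, since it only looks at names, not values.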

PS: Oh, your nickname looks quite familiar. I have used StreamSaver.js in a few projects. 🎉

jimmywarting commented 3 years ago

I haven't looked into the ins and outs of your code to see how you deobfuscate JavaScript, but I assume some AST (abstract syntax tree) is used to parse the code into plain objects?

For the case of duplication, e.g.:

let m = 'hello'

const alertStore = [
  { message: m },
  { message: m }
];

then the corresponding AST object would look something like this:

{
  scope: [
    { type: 'variable', using: 'let', key: 'm', value: { type: 'string', value: 'hello' } },
    { type: 'variable', using: 'const', key: 'alertStore', value: { type: 'array', value: [
      { type: 'object', keys: {
        message: __REFERENCE_TO_OBJECT_M__
      } },
      { type: 'object', keys: {
        message: __REFERENCE_TO_OBJECT_M__
      } }
    ] } },
  ]
}

(PS: it's been a long time since I looked at an AST, but it's something like this.)

For the case of { message: m }, you won't actually have to replace code or strings, since it's just an object reference in the AST to { type: 'variable', using: 'let', key: 'm', value: { type: 'string', value: 'hello' } }. So you only have to change one object, { type: 'variable', key: 'm', ... }.key = 'message', to update the whole AST and its output?

Of course, there is also some concern about scope (I'm guessing).
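
To make the shared-reference idea concrete, here is a small sketch using the same toy node shape as above (not a real parser's format). Every use of m points at one binding object, so changing its key changes every place it is printed. The printValue and printDeclaration functions are hypothetical helpers written for this example. Real parsers generally emit a separate identifier node per occurrence, so a scope-analysis pass is what would link them to one binding.

```javascript
// One shared binding object for the variable `m` (toy shape).
const m = {
  type: 'variable', using: 'let', key: 'm',
  value: { type: 'string', value: 'hello' },
};

const alertStore = {
  type: 'variable', using: 'const', key: 'alertStore',
  value: { type: 'array', value: [
    { type: 'object', keys: { message: m } }, // reference, not a copy
    { type: 'object', keys: { message: m } },
  ] },
};

// Minimal printer for this toy tree.
function printValue(node) {
  if (node.type === 'variable') return node.key; // prints the current name
  if (node.type === 'string') return JSON.stringify(node.value);
  if (node.type === 'array') {
    return `[${node.value.map(printValue).join(', ')}]`;
  }
  if (node.type === 'object') {
    const entries = Object.entries(node.keys)
      .map(([key, value]) => `${key}: ${printValue(value)}`);
    return `{ ${entries.join(', ')} }`;
  }
  throw new Error(`Unknown node type: ${node.type}`);
}

function printDeclaration(decl) {
  return `${decl.using} ${decl.key} = ${printValue(decl.value)};`;
}

// A single assignment renames the declaration and both uses.
m.key = 'message';
const output = [m, alertStore].map(printDeclaration).join('\n');
// let message = "hello";
// const alertStore = [{ message: message }, { message: message }];
```

The scope concern still applies: before assigning the new key, a real tool would have to confirm that no other binding visible at any use site already has that name.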

PS: Oh, your nickname looks quite familiar. I have used StreamSaver.js in a few projects. 🎉

Cool 😃

lelinhtinh commented 3 years ago

My code is a mess; take a look at lib/utils.js and lib/obfuscatorio.js. I mainly use regex to extract the strings needed for decoding, rather than parsing the entire source code into a complete AST.

There are other projects that use an AST, like deobfuscator-io. I will probably still do it my way for a while... to look different.

PS: I'm looking at the project I linked above. Here's what you described, updating the variable name: https://github.com/sd-soleaio/deobfuscator-io/blob/b0be2cb04845d6a6d5f4bb8eed3f8574837fe141/src/deobfuscator.js#L154-L167