mindedsecurity / JStillery

Advanced JavaScript Deobfuscation via Partial Evaluation
GNU General Public License v3.0
856 stars 143 forks source link

Some opaque predicates #6

Open mikesamuel opened 6 years ago

mikesamuel commented 6 years ago

I don't know if opaque predicates are in bound, but in the REPL,

const a = !!(Math.random() < 2)
const b = !!(new Date().getYear())
const c = !!Date.now()
const d = !((+c/0) == (+c/0))

deobfuscates to

const a = !!(Math.random() < 2);
const b = !!new Date().getYear();
const c = !!Date.now();
const d = !(+!!Date.now() / 0 == +!!Date.now() / 0);
wisec commented 6 years ago

@mikesamuel Date() is the only one atm that returns the string. I thiml it would be ok to get their value optionally. At least for starter. What do you think ?

mikesamuel commented 6 years ago

@mikesamuel Date() is the only one atm that returns the string.

I'm not sure that we're on the same page.

These predicates can be used to prevent constant folding. For example,

const name = 'constructor'
const chars = name.split('').map((c) => c.charCodeAt(0))
const launderedChars = chars.map((c) => String.fromCharCode(c + +!Date.now()))
const launderedName = launderedChars.join('')
// Here's what an attacker might want to get at without being obvious.
const f = {}[name][name]
const fByAnyOtherName = {}[launderedName][launderedName]

deobfuscates to

const name = 'constructor';
const chars = [
    'c',
    'o',
    'n',
    's',
    't',
    'r',
    'u',
    'c',
    't',
    'o',
    'r'
].map(c => c.charCodeAt(0));
const launderedChars = chars.map(c => String.fromCharCode(c + +!Date.now()));
const launderedName = launderedChars.join('');
const f = Function;
const fByAnyOtherName = {}[launderedName][launderedName];

Date.now() and (new Date()).getFullYear() could theoretically be 0 but an attacker can rely on it not being.

By spec, Math.random() occurs in [0, 1), so an attacker can rely on it being < 2.

I expect it's easy to find boolean expressions that always or almost always have the same value. I just didn't know whether it was worth collecting them or whether you were planning on trying to compensate.

Fyi, @jasvir

jasvir commented 6 years ago

Surprisingly easy to generate opaque predicates and difficult to solve them generally.

If you take it on, the most effective I've seen was to treat functions like Math.random()/Date.now()/Object.keys() as if they're returning a const value but have them return a representative value.

mikesamuel commented 6 years ago

@jasvir, Math.random() !== Math.random() is fairly reliably true but fails the representative value test.

I suppose your symbolic execution could treat it as a spread value, but you'd still have nonlocal workarounds like the below, where you need to recognize that the same specific instance is used in different places:

let a = []
let n = (100 * Math.random()) >> 0
let o = { length: n }
Object.assign(a, o)
a.push(true)
let p = a[n]
wisec commented 6 years ago

Hi @mikesamuel, @jasvir I also meant there should be a strategy in following opaque predicates. 3 options come to my mind (my 2.cents):

  1. The easier is to simply give the possibility to substitute the actual values to opaque functions. That can be repeated over and over to look how the script behaves and the choice might be specific to each function.
  2. The second one would be to add a "tainting" flag and propagate it over the results so it would be easier to analyze the opaque parts.
  3. The third and more complex, it does not substitute values but tags the dependent branches/operation/variable to get the different paths. After that, further analysis might be performed to identify/prune and show different execution paths.

    Do some of these ideas make sense to you guys?

mikesamuel commented 6 years ago
  1. The easier is to simply give the possibility to substitute the actual values to opaque functions. That can be repeated over and over to look how the script behaves and the choice might be specific to each function.

This would probably work for uses of the ! operator as above where you only need to make a gross distinction between the small number of falsey values and non-falsey values.

  1. The second one would be to add a "tainting" flag and propagate it over the results so it would be easier to analyze the opaque parts.

So this would be metadata that later, custom passes might use?

  1. The third and more complex, it does not substitute values but tags the dependent branches/operation/variable to get the different paths. After that, further analysis might be performed to identify/prune and show different execution paths.

This sounds like it would also help with gross distinctions like the falsey/truthy ones above.


My sense is that these techniques might allow you to analyze some programs in the wild which you can't analyze now, but that you're going to be in a cat&mouse game.

Most of the examples I gave depended on ! and its behavior with respect to truthy/falsey values.

How would they scale to non-binary correspondences like the below which depends on a relationship between a function argument and a property of the return value mediated by a host object for which we have only informal semantics?

function opaqueTrue(s) {
  try {
    return s.toLocaleLowerCase() === document.createElement(s).localName
  } catch (e) {
    return opaqueTrue('a')
  }
}

const nonSelfEvidentTruths = [
    opaqueTrue('b'), opaqueTrue('i'), opaqueTrue('q'), opaqueTrue('s'), opaqueTrue('u'),
    opaqueTrue('B'),/*opaqueTrue('I'),*/ opaqueTrue('Q'), opaqueTrue('S'), opaqueTrue('U'),
]

const notTrueInTurkishBrowser = opaqueTrue('I')