Flash logic error in bytecode verifier

GoogleCodeExporter commented 9 years ago

avmplus (an implementation of AVM2: 
http://www.adobe.com/content/dam/Adobe/en/devnet/actionscript/articles/avm2overv
iew.pdf) is the virtual machine used in adobe flash for running actionscript 3 
code. There's a pretty recent source code dump at 
https://github.com/adobe-flash/avmplus which I used for this research.

Avmplus has untrusted bytecode. It has a verifier which has two roles: firstly, 
it tries ensure that all instructions are safe; neither the interpreter nor the 
jitted code check stack accesses for example so it's up to the verifier to 
statically prove that all these accesses are safe. Secondly, using the same 
static analysis technique the verifier collects type information for the 
optimizing jit compiler. If the verifier is able to prove for example that at a 
certain program point a certain variable can only be an Integer then the jit 
codegen is able to use that information to potentially generate faster code by 
omitting a type check.

Specifically avmplus uses abstract interpretation to try to explore every path 
in a method. At each point in the method type information is stored for each 
local and stack variable as well as the current stack depth. When two paths 
merge, as well as safely merging any conflicting types it's also required that 
the statically computed stack depth be equal (so you can't push in a loop for 
example.)

For this to be safe it's critical that the abstract interpreter is able to find 
every possible execution path in the method - if you can execute a path at 
runtime which the verifier didn't find then it means that for example the jit 
may have emitted code for which the preconditions no longer hold. This is 
generally very bad :) Haifei Li has a great talk about CVE-2011-0609 which was 
an example of this type of bug ( 
http://recon.cx/2012/schedule/attachments/43_Inside_AVM_REcon2012.pdf .)

The two important functions for bytecode verification are checkTarget and 
verifyBlock. checkTarget signals that there is a possible control-flow edge 
between two addresses and verifyBlock is responsible for verifying a linear 
block of avm2 bytecode. There are two types of control flow possible in AVM2 - 
either explicit (instructions like iftrue, lookupswitch etc) or via throwing an 
exception. Here's the snippet of code responsible for handling exception-based 
control flow:

  if (pc < tryTo && pc >= tryFrom &&
      (opcodeInfo[opcode].canThrow || (isLoopHeader && pc == start_pos))) {
      // If this instruction can throw exceptions, treat it as an edge to
      // each in-scope catch handler.  The instruction can throw exceptions
      // if canThrow = true, or if this is the target of a backedge, where
      // the implicit interrupt check can throw an exception.
      for (int i=0, n=exTable->exception_count; i < n; i++) {
          ExceptionHandler* handler = &exTable->exceptions[i];
          if (pc >= code_pos + handler->from && pc < code_pos + handler->to) {
              ...
              const uint8_t* target = code_pos + handler->target;
              ...
              checkTarget(pc, target, /*isExceptionEdge*/true);

The comment in the code explains pretty clearly what's going on. If this 
instruction could throw an exception and it's within the bounds of a try block 
add an edge (by calling checkTarget) to the catch handler. Whether or not an 
instruction could throw an exception is stored in the opcodeInfo array which is 
generated from the macros in opcodes.tbl :

  //              opCount  throw  stack  internal  name                   hex     
  ABC_UNUSED_OP(  -1,      0,     0,     0,        0x00)                  // 0x00 
  ABC_OP(         0,       0,     0,     0,        bkpt)                  // 0x01 
  ABC_OP(         0,       0,     0,     0,        nop)                   // 0x02 
  ABC_OP(         0,       1,     -1,    0,        throw)                 // 0x03
  ...and so on for every opcode

I looked through this list of opcodes for instructions which aren't marked as 
"throw" to see if any could, in fact, actually throw an exception. Let's take a 
closer look at this group of instructions which are marked as not throwing:

  ABC_OP(         0,       0,     0,     0,        sxi1)                  // 0x50 
  ABC_OP(         0,       0,     0,     0,        sxi8)                  // 0x51 
  ABC_OP(         0,       0,     0,     0,        sxi16)                 // 0x52

Here's the code from verifyBlock for these instructions:

  // sign extends
  case OP_sxi1:
  case OP_sxi8:
  case OP_sxi16:
      checkStack(1,1);
      emitCoerce(INT_TYPE, sp);
      coder->write(state, pc, opcode);
      state->pop_push(1, INT_TYPE);
      break;

And here's the code from Interpreter.cpp which makes it clear what the 
instructions do:

  INSTR(sxi1) {
      i1 = AvmCore::integer(sp[0]);
      sp[0] = CHECK_INT_ATOM(Atom(((i1 << (8*sizeof(Atom)-1)) >> ((8*sizeof(Atom)-1)-3)) | kIntptrType));
      NEXT;
  }
  ... similar code for sxi8 and sxi16

These three instructions sign extend 1, 8 or 16-bit integers. The verifier only 
checks that there's at least one argument on the stack, it doesn't check the 
type. Indeed, looking at the interpreter implementation of the instruction it 
calls AvmCore::integer(sp[0]) to first convert whatever Atom is on the top of 
the stack to an integer before performing the sign extension and converting 
that value back to an Atom.

(Briefly, an Atom in avmplus is a tagged-pointer, akin to a JSValue in a 
javascript engine. Depending on the value of the tag bits it can either be a 
concrete value (like an integer) or a pointer to a more complicated object like 
a string.)

Given that the verifier allows any type to be passed as the argument to sxi*, 
what does the AvmCore::integer() function do to convert non-integers to 
integers?

  /*static*/ int32_t AvmCore::integer(Atom atom)
  {
      const int kind = atomKind(atom);
      if (kind == kIntptrType)
      {
          AvmAssert(int32_t(atomGetIntptr(atom)) == (int32_t)integer_d(number(atom)));
          return int32_t(atomGetIntptr(atom));
      }
      else if (kind == kBooleanType)
      {
          return int32_t(atom>>3);
      }
      else
      {
          // TODO optimize the code below.
          return (int32_t)integer_d(number(atom));
      }
  }

So for the types which aren't already pretty much an integer it passes the atom 
to ::number:

/*static*/ double AvmCore::number(Atom atom)
{
...
      const int kind = atomKind(atom);

      // kIntptrType is by far the most common
      if (kind == kIntptrType)
          return (double) atomGetIntptr(atom);

      ...

      // all other cases are relatively rare
      switch (kind)
      {

      ...

      case kObjectType:
          atom = AvmCore::atomToScriptObject(atom)->defaultValue();
          break;  // continue loop, effectively a tailcall
      }
  }

The last case is the most interesting one - if we've passed an object which 
isn't any of the other types then this will call defaultValue on that object:

  Atom ScriptObject::defaultValue()
  {
      AvmCore *core = this->core();
      Toplevel* toplevel = this->toplevel();

      Atom atomv_out[1];

      Multiname tempname(core->findPublicNamespace(), core->kvalueOf);
      atomv_out[0] = atom();
      Atom result = toplevel->callproperty(atom(), &tempname, 0, atomv_out, vtable);
  ...

This is calling back into bytecode! Specifically, if we define an object with a 
valueOf method then we can run our own code when that object is converted to an 
integer. And if that code throws an exception and doesn't catch it then it will 
be propagated up the call chain meaning that the sxi* instructions actually 
*can* throw exceptions if passed the right object :)

How to exploit this?

We need to set up two things: firstly, we need to be able to pass an object 
which overrides valueOf (and throws an exception) to an sxi* instruction. 
Interestingly the mxmlc compiler throws an error if you "import 
avm2.intrinsics.memory.sxi1" and try to pass such an object to the sxi1 
intrinsic - no worry, we have a hexeditor and this certainly isn't enforced on 
the bytecode level as we saw.

Secondly we have to set things up such that the jit will actually emit 
dangerous code due to not having explored the code path which goes via the 
exception. This is quite important - there isn't a vulnerability unless you can 
actually do something dangerous on the unexplored path:

  (this is a psuedo-disassembly of the bytecode, in reality I wrote most of the attached PoC in actionscript and edited it a hex editor to introduce the conditions to hit the bug)

  // this function is passed an object which overrides valueOf and throws and exception as well as a string object
  function(override_object, a_string) // arguments are passed in the "local" registers: local0 is the "this" object, local1 is the first argument and so on:

    pushint 0x414243
    setlocal3
    getlocal1 (first argument, override_object)

    try {
      sxi1
      getlocal2 (second argument, a_string)
      setlocal3
      throw
    } catch {
      getlocal3
      sxi1
      pop
      returnvoid
    }

    returnvoid
  }

What's going on here? The verifier is going to try to prove the types of 
variables which can be in each local variable at each program point. The first 
two instructions set local3 to hold the integer 0x414243. We then get the first 
argument to the function (passed in local1) and pass that to the sxi1 
instruction. The verifier believes that the sxi1 instruction cannot throw an 
exception, therefore it won't call checkTarget(this sxi instruction, the catch 
block). Instead the verifier thinks that the code can *only* continue to 
execute the next three instructions which set local3 to contain the string 
passed as the second argument. What this means is that when the verifier 
verifies the catch block, it believes that the only path to the entry point of 
the catch block is the preceeding throw instruction. However, we know that 
there's another path here - namely via the exception which the sxi1 instruction 
will throw. The important difference between these two paths is the value of 
local3. The verifier only sees the path where local3 is set to a string after 
the sxi instruction - it therefore believes that it has proved that only a 
string value can end up in local3 in the catch block. However, before the sxi 
instruction local3 is actually an integer.

What this means is that the verifier will tell the jit codegen that it's safe 
to assume that local3 in the catch block will *always* be a string, therefore 
the the jit will generate optimized code assuming that local3 is a string. Then 
what? The next instruction is another sxi1 - this is just there to demonstrate 
one possible way to exploit this jit type confusion since the jitted code for 
sxi1 will first convert the argument to a integer - except this time it knows 
that the argument is a string (when in fact it is an integer.) And is just so 
happens that converting a string to an integer involves calling a function 
pointer, so this particular example will crash treating our controlled integer 
(0x414243) as a pointer to an object from which a function pointer will be read 
:)

I've attached two PoCs:
jit_confused.swf has been tested against the latest version of flash projector 
for os x ( 15.0.0.152 )
jit_confused.abc is an abc file for avmshell which I used during research

Original issue reported on code.google.com by ianb...@google.com on 15 Sep 2014 at 4:52

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by ianb...@google.com on 15 Sep 2014 at 4:57

Added labels: ReportedOn-2014-September-15

GoogleCodeExporter commented 9 years ago

Original comment by ianb...@google.com on 15 Sep 2014 at 6:11

Added labels: Id-3036

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 8 Nov 2014 at 2:34

Added labels: CVE-2014-0590

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 8 Nov 2014 at 2:34

Added labels: ReportedOn-2014-Sep-15
Removed labels: ReportedOn-2014-September-15

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 20 Nov 2014 at 12:53

Changed state: Fixed
Removed labels: Restrict-View-Commit

GoogleCodeExporter commented 9 years ago

http://helpx.adobe.com/security/products/flash-player/apsb14-24.html

Original comment by cev...@google.com on 20 Nov 2014 at 1:21

Added labels: Fixed-2014-Nov-11

GoogleCodeExporter commented 9 years ago

I have a question:

E:\AVM\xxx>java -jar asc.jar -import playerglobal.abc -import global.abc 
hello.as

hello.abc, 489 bytes written

E:\AVM\xxx>avm.exe hello.abc
VerifyError: Error #1014

the contents of hello.as:
 package {
    import flash.display.Sprite;
    public class Hello extends Sprite {
          public function Hello() {
               trace("Hello");
          }
    }
}

why? thanks!

Original comment by caozhihua325 on 19 Jan 2015 at 10:01

telnetgmike / google-security-research

Flash logic error in bytecode verifier #106