Status: Closed (by GoogleCodeExporter, 8 years ago)
Due to the way "length" is built, it can only be positive.
However, if it wraps around 32 bits (i.e. grows beyond 2^31 - 1), it can become
negative.
Making it "size_t" instead of "int" might reduce the issue.
Now, as a practical attack, it seems quite improbable: "op" is unknown to the
potential attacker and could have any value. It is therefore very unlikely
that an attacker could generate content which creates a length of exactly
"-op" (after wrapping beyond INT_MAX).
Original comment by yann.col...@gmail.com
on 6 Dec 2012 at 1:27
It does not have to be exactly equal to -op.
For example, if you craft an input file that contains 9,000,000 0xFF bytes
followed by some zeros, 'length' will wrap and become a negative value, which
can cause things to crash.
When I tried yesterday, I could easily make it crash by feeding it millions of
0xFF bytes.
Original comment by strig...@gmail.com
on 6 Dec 2012 at 1:31
OK, so if this applies to an extended range of values, not "only -op", then it's
a practical attack. I'll look into it.
Original comment by yann.col...@gmail.com
on 6 Dec 2012 at 1:33
This crashes for me on Windows in 32-bit mode:

#include "lz4.h"

char output[20<<20];
char input[20<<20];

int main(int argc, char* argv[]) {
    input[0] = 0x0F;
    input[1] = 0x00;
    input[2] = 0x00;
    for (int i = 3; i < 16800000; i++)
        input[i] = 0xff;
    LZ4_uncompress(input, output, 20<<20);
    return 0;
}
Original comment by strig...@gmail.com
on 6 Dec 2012 at 1:58
Thanks for the sample. I'll look into it.
Original comment by yann.col...@gmail.com
on 6 Dec 2012 at 2:00
This program also crashes once in a while, depending on ASLR (Address Space
Layout Randomization):

#include "lz4.h"
#include <stdlib.h>
#include <string.h>

char input[20<<20];
char output[20<<20];

int main(int argc, char* argv[]) {
    for (int j = 0; j < 10; j++) {
        int n = 16700000 + j * 50000;
        char *p = input;
        *p++ = 0xF0;
        memset(p, 0xff, n); p += n;
        *p++ = 0x0;
        LZ4_uncompress(input, output, 20<<20);
    }
    return 0;
}
Original comment by strig...@gmail.com
on 6 Dec 2012 at 2:20
Here's a possible fix for the first half of this bug:

if ((length=(token>>ML_BITS)) == RUN_MASK) {
    len = *ip++, length += len;
    if (len == 255) {
        do len = *ip++, length += len; while (len == 255);
        // verify that cpy=op+length below will not wrap.
        if unlikely(op + length < op) goto _output_error;
    }
}
I unrolled the code a bit, so this code is actually FASTER than the old code.
Original comment by strig...@gmail.com
on 6 Dec 2012 at 10:59
Also, it should be size_t.
size_t len, length;
Original comment by strig...@gmail.com
on 6 Dec 2012 at 11:01
Here's a possible fix for the second half of the bug:

// get matchlength
if ((length=(token&ML_MASK)) == ML_MASK) {
    length += *ip;
    if (*ip++ == 255) {
        do length += *ip; while (*ip++ == 255);
        // verify that "cpy = op + length + 4" below does not wrap.
        if unlikely(op + length + 4 < op) goto _output_error;
    }
}
It's also fast in my benchmark.
Original comment by strig...@gmail.com
on 6 Dec 2012 at 11:16
That's impressive.
My dev station is currently down, but I'll repair it and give it a shot a.s.a.p.
Original comment by yann.col...@gmail.com
on 7 Dec 2012 at 12:00
Hi,
I tried your suggested code, but for some reason it did not result in a
performance improvement on my station. In fact, it's the other way round:
the benchmark measures a loss of about 1-2%.
Original comment by yann.col...@gmail.com
on 8 Dec 2012 at 10:33
Another piece of feedback:
I've switched to Visual Studio 2012.
I expected the optimizations of VC2012 and VC2010 to differ, which could
explain why, in my previous experiment, I witnessed a loss of performance
instead of your reported win.
To my surprise, the loss of performance was even larger under VC2012, at
approximately -10%.
So now I'm wondering, maybe I'm doing it wrong.
In my experiment, I've replaced the line:

if ((length=(token>>ML_BITS)) == RUN_MASK) { size_t len; for (;(len=*ip++)==255;length+=255){} length += len; }

with your suggestion:

if ((length=(token>>ML_BITS)) == RUN_MASK)
{
    BYTE len = *ip++; length += len;
    if (len == 255)
    {
        do len = *ip++, length += len; while (len == 255);
        // verify that cpy=op+length below will not wrap.
        if unlikely(op + length < op) goto _output_error;
    }
}

Both are therefore followed by:

cpy = op+length;
if unlikely(cpy>oend-COPYLENGTH)
{
    if (cpy != oend) goto _output_error;   // Error : not enough place for another match (min 4) + 5 literals
    memcpy(op, ip, length);
    ip += length;
    break;    // EOF
}

Is that the way it is supposed to be implemented?
Original comment by yann.col...@gmail.com
on 10 Dec 2012 at 10:05
This is the code I test with. You can see my benchmark numbers in the comments;
they're very close, but I'm surprised I don't get any negative effect from the
added out-of-bounds check. I don't know why you get worse numbers. Maybe our
CPUs are different.
#if 1
// Running this three times gave the following: 2176MB/s 2184MB/s 2204MB/s
if ((length=(token>>ML_BITS)) == RUN_MASK) {
    len = *ip++, length += len;
    if (len == 255) {
        do len=*ip++, length += len; while (len == 255);
        // verify that cpy=op+length below will not wrap.
        if unlikely((uintptr_t)op + length < (uintptr_t)op) goto _output_error;
    }
}
#else
// Running this three times gave the following: 2180MB/s 2166MB/s 2169MB/s
if ((length=(token>>ML_BITS)) == RUN_MASK) { for (;(len=*ip++)==255;length+=255){} length += len; }
#endif
Original comment by strig...@gmail.com
on 10 Dec 2012 at 10:20
Small comment (almost a note to myself):
it's worth remembering that LZ4_compress() only accepts input sizes <= ~1.9 GB.
As a consequence, it's not possible for any copy size to be > 2 GB.
So, if such a value is nonetheless encoded in the compressed stream, it must
be malformed/malicious data.
Original comment by yann.col...@gmail.com
on 22 Dec 2012 at 7:24
Second note to self:
the issue can only happen in 32-bit mode.
It's not possible for the pointer value to wrap around the address space in
64-bit mode.
This means the test can be restricted to 32-bit mode only.
Original comment by yann.col...@gmail.com
on 30 Mar 2013 at 11:20
Hi, I just disclosed a similar bug in another compression scheme to the Linux
kernel team. Their implementation of LZ4 is also affected. If you could, please
bump this to High priority because it can be controlled for precise memory
corruption. This vulnerability can lead to code execution.
Original comment by d...@securitymouse.com
on 20 Jun 2014 at 6:42
Note that this bug cannot be exploited within the Linux kernel.
It happens only in 32-bit mode, and only with blocks > 8 MB.
Such a situation is not possible for the version within the Linux kernel, since
it uses the legacy file compression format, which has a hard block size limit
of 8 MB (the newer format has a hard limit at 4 MB).
For this bug to happen, you need to create a specific/custom format, generating
compressed blocks > 8 MB, in 32-bit mode. It's a very specific combination
which happens very rarely, if ever. And more importantly, it does not happen
within the Linux kernel implementation.
Nonetheless, this bug will be corrected in an upcoming version.
Original comment by yann.col...@gmail.com
on 20 Jun 2014 at 9:07
The latest LZ4 release within the "dev" branch fixes this issue:
https://github.com/Cyan4973/lz4/tree/dev
Ludwig Strigeus' overflow test is now integrated into the fuzzer test tool,
and is passed correctly by both the 32-bit and 64-bit versions.
Regards
Original comment by yann.col...@gmail.com
on 22 Jun 2014 at 10:46
Fixed into r118
Original comment by yann.col...@gmail.com
on 26 Jun 2014 at 9:50
Hi Yann,
Sorry communication went poorly on this issue. I did not see your responses to
this flaw. Regardless, thank you for fixing the code. I appreciate it.
Best,
D
Original comment by d...@securitymouse.com
on 27 Jun 2014 at 4:57
Hum, yes it did.
A flaw in a largely deployed piece of code is not a "gold fame nugget". It's a
risk for the community. I can only hope Oberhumer had enough time to design
and deploy a fix before the risk was exposed. This *should* be a core value
of a security firm.
My other grudge is that the risk regarding LZ4 was overblown in your article,
even though it was explained to you that it wasn't possible to exploit it
within currently known implementations, due to block size limits. It's a good
thing to correct this risk for future implementations, but it's not a good
idea to make the world believe it is at risk *right now*, when that's not
true. No known implementation gets even near the block size required to
trigger the risk; this applies to the Linux kernel version too.
Original comment by yann.col...@gmail.com
on 27 Jun 2014 at 8:22
I appreciate your perspective, Yann. But, I think you misunderstand how I've
looked at this issue. I've worked in infosec for 15 years, I definitely
understand the risks involved.
I coordinated with every other team involved in this. I worked with each team
to build patches (except for Markus, who developed his on his own). We
coordinated a release, and that timeline was unfortunately broken by a public
git push. Even if that didn't happen, the Linux kernel team (Linus made this
very clear) merges security related fixes to mainline within 5 days of
disclosure, no exceptions. So, they would have gone forward regardless.
I think you're seeing this from an unfortunate point of view because we were
unable to coordinate. There are two things that caused this:
1) I found the Linux kernel LZ4 bug when we were performing the LZO triage. In
looking for other implementations, I found yours. I saw the bug report here and
realized that this issue had already been set aside as a non-issue. Seeing it
was a year+ old, it occurred to me that I might never hear from you, and since
I already had the Linux team's ear, I went ahead and coordinated with them,
building a patch for LZ4 with Greg.
2) Google Code requires me to sign in with my Google account to post on this
bug report site. But it doesn't alert me (as I presumed it would) when a
response to one of my posts is made public. I presumed that, like every other
Google product, it would alert me when I had a response in an interface. It did
not, and I wrongly presumed that you had ignored my post here. My apologies for
that.
So, yes, everyone was given enough time. Patches were all developed before my
press and blog post. Things were coordinated appropriately, and FWIW even Solar
Designer said that things went well with almost non-existent issues.
I know you were upset, but please don't speculate on things you did not have
insight into. I'm sorry you were left out of the loop, but I am only here to
help you, not hurt you.
Regarding LZ4 being overblown, you are right that implementations are not
affected. If you had finished reading my original blog post, you would have
noted that I stated this exact thing by referencing the ZFS 128k block
limitation. That isn't the point. The point is that the algorithm doesn't care
about a size limit for buffers, and there is no documentation in the code to
require a limit. This, just as with the LZO implementation, will eventually
lead to people misusing the API. That's just a fact of engineering. So, yes,
it is a vulnerable algorithm and yes, it needed patching.
My PoC demonstrates how the algorithm itself is vulnerable. It does not matter
whether it is in kernel or user mode. The demonstration proves that the
algorithm itself allows for random memory overwrite. Having written kernels for
multiple architectures, and having written kernel exploits for everything from
Plan 9, to AVR, to Linux and BSD, I think I get the difference :-)
Thanks, Yann.
D
Original comment by d...@securitymouse.com
on 27 Jun 2014 at 4:42
By the way, you should give me a mailing address so I can apologize to you
using my custom Lab Mouse Security Sympathy Card :-) At least we can resolve
this with a smile and a laugh or two. Hopefully someday I can buy you a beer.
https://www.etsy.com/listing/193821961/sorry-to-make-you-patch-sympathy-card
Original comment by d...@securitymouse.com
on 27 Jun 2014 at 4:45
OK, this is indeed a much better ending.
As you have already guessed, the only elements I could perceive so far were:
- someone leaving a comment on an already-opened issue
- not caring about my answers
- publishing, just a few days later, a big headline-grabbing blog post
- pretending to have found the bug
- and dressing it up heartbleed-like (RCE!)
With such elements at my disposal, the sole logical conclusion I could come up
with was an irresponsible fame-grab attempt, even if it would put millions of
users at risk.
Hence my immediate anger.
However, the scenario you describe above is much, much better.
I could not have guessed it up to now, but if correction and deployment were
indeed coordinated with the relevant parties before the article's publication,
then it's a much better standard of conduct, which I do respect.
Original comment by yann.col...@gmail.com
on 27 Jun 2014 at 5:59
Btw, it's more minor, but be informed that I don't agree with the general tone
of your blog post. It's not that the elements presented are technically wrong,
but the weight of the risk seems unfairly exaggerated.
It's correct to say there is a vulnerability, which requires some specific
conditions to be exploitable. However, it's critical to underline that *no
currently known program* using LZ4 today matches these conditions, including
programs within the Linux kernel.
So yes, I've seen your note about ZFS, but it's clearly not enough. It's almost
hidden in the middle of your article. Most casual readers haven't reached it,
nor understood its general consequences.
Just roam the forums and social networks, and you'll see most readers jump to
conclusions after reading mostly the headline and the nice conclusion table:
*current* programs using LZ4 or LZO are basically big security holes.
I can't comment on LZO variants, but for registered LZ4 programs, that's not
true.
Yes, future programs may inadvertently fall into the pit, and as a consequence,
the vulnerability must be patched (and has been patched, btw).
But it's not correct to extend that logic to existing programs. Existing
programs are safe, because they don't match the conditions required to exploit
the vulnerability.
And that's *extremely* important. This is *the* huge difference from a
*heartbleed-like* bug disclosure.
In conclusion, I love your latest description, which seems fair and correct to
me:
"you are right that implementations are not affected. That isn't the point. The
point is that there is no documentation in the code to require a limit. This
will eventually lead to people that misuse the API. So, yes, it is a vulnerable
algorithm and yes, it needed patching."
Yes. E-xac-tly.
I would love it to be understood this way by your readers.
Regards
Original comment by yann.col...@gmail.com
on 27 Jun 2014 at 6:38
Yeah, I think the tone of the post was written from a different perspective
than yours, which is why we ended up disagreeing again.
You're looking at it from the point of the current threat surface within known
implementations. I'm looking at it from the sole point of view of the library
function's vulnerability, and the potential for abuse in the future.
The ratings were all given fair levels of importance based on how the functions
themselves can be exploited, not based on the implementations. This is because
I do not know - nor does anyone know - how each implementation is used, or
whether it is used correctly. So, I only spoke from a perspective of the
information I could verify. This is why the Linux kernel LZO RCE was deemed
"Impractical" in the blog post and in the bug report. I can prove DoS using the
bug, but can the library function be instrumented alone for RCE? No. At least,
not by me.
I put things into the context that was fair for the release, as I refuse to
conjecture about what systems may or may not be vulnerable. Also, by describing
the result of exploiting the function itself, I give each engineer implementing
a use case of the function the power to know how and what will result if the
function is used in a way that exposes the application to risk. That is useful
information that can help significantly reduce the threat surface of a target
application.
As for LZ4 variants, it turns out a lot of A/V utilities automatically unpack
LZO- and LZ4-compressed files. Have you verified that they are all using the
block sizes correctly? I think there are probably a lot of implementations out
there that are not using the APIs correctly, because they can't.
A/V is a great example of this because they have to be able to decode LZ4
payloads that are potentially mangled on purpose to bypass A/V checks. This
might be a great place to look for potential fixes.
Regardless, I am very happy to see that we are starting to see eye to eye.
Thanks for taking the time to talk with me here. I appreciate it a lot :-)
D
Original comment by d...@securitymouse.com
on 27 Jun 2014 at 7:14
If I understand the situation correctly, A/V software automatically unpacking
LZ4 files must use the official documented LZ4 format (otherwise, it would not
be able to read the file).
This format is safe, because it limits the maximum block size to 4 MB (8 MB
for the legacy format). To be fully safe, the decoder must also check the
block size condition before loading the block, but this is a necessity anyway,
due to memory allocation management. And this is, btw, properly done within
the reference code lz4io.c.
Of course, I can't guarantee they are using the reference file reader code. Nor
can I guarantee that they use the reference algorithm lz4.c either. But then,
potential vulnerabilities are endless.
Original comment by yann.col...@gmail.com
on 27 Jun 2014 at 7:34
Original issue reported on code.google.com by
strig...@gmail.com
on 5 Dec 2012 at 7:50