Question: how does an "insecure" library become "secure"?

lovell commented 8 years ago

Hi Nathanael, I was alerted to this rather interesting repo via your recent questions about libjpeg-turbo.

One thing that caught my eye was the statement in the readme about "trusted data":

"...all the libraries that I've reviewed are insecure. Some assume all images are trusted data (libvips)..."

libvips relies on the "standard" image decoding libraries such as libjpeg-turbo, libpng etc., most of which have seen the recent attention of fuzzing tools such as AFL.

Are you able to explain a little more about what is meant by "insecure" and "trusted data" in this context? What would you expect libraries to be doing in order to become "secure"?

My best guess is that this might relate to "trusted metadata", where fake headers could result in large memory allocations, assuming that data was able to get through the decoding library's own checks first.

Cheers, Lovell

/cc @jcupitt

jcupitt commented 8 years ago

On libvips, as @lovell says, it mostly relies on decoding libraries to do sanitisation. It also sanity checks values it gets from these libraries, for example it checks for integer overflow on image width * height, a classic way to trigger out of bounds reading. I wouldn't claim that the checks are complete though, I'm sure a determined attacker could find a way through.

There's also the question of trust of metadata. The vips jpeg loader passes the IPCT block from the image on to users, for example, and does not attempt to test it for validity. That could lead to the compromise of something further down the chain, if there's a bug in an xml parser.

It has some loaders which we've coded ourselves, for example for reading CSV data. Again, these do try to be robust in the face of malicious data, but who knows, they've not been fuzzed much, as far as I know.

Another very common way to attack libraries is via environment variables. If an attacker can manipulate the environment variables a privileged process is passed it can often trigger bad behaviour in some library that the process uses and inject code. libvips has had a CVE about this in the past.

Image load libraries are extremely large and complex, and because libvips uses many of these libraries, it has a huge surface area for attack. It would be difficult to ever guarantee that it is really hacker-proof. Most iPhone jailbreaks are via the tiff library, for example, but that's actually relatively secure, since it's been around forever. Consider OpenJPEG, the usual jp2k library, it's still undergoing rapid development, it must have huge holes.

I would say the solution is layered security.

Only enable the load libraries you really need. For example, libvips will open microscope slide images, which most websites will not require.
Keep all the image load libraries patched and updated daily.
Keep the image handling part of a site in a sandbox: a separate process, or even a separate machine, running as a low-privilege user.
Kill and reset the image handling system regularly, perhaps every few images.

Sandboxing is easy to do on most operating systems and probably a feature of most frameworks (or it certainly should be).

This would apply to every image handling system, of course, not just libvips.

jcupitt commented 8 years ago

libvips has a test suite, by the way, I don't know if you saw it:

https://github.com/jcupitt/libvips/tree/master/test

make check runs a rather old test of the CLI interface, but ./test_python.sh tests the whole thing from python2 and python3. Some parts are exercised more carefully than others. It runs cleanly through valgrind, or did last time I had the patience to try it.

I've not tried to get any coverage stats, that would probably be the next thing to do.

lilith commented 8 years ago

First, let me say that I love libvips - my comment was not intended to disparage it. And since I wrote those words two or three years ago (copied from a previous repo), libvips has begun seeing a massive increase in usage outside of its original problem space/market, and is now used in many popular image servers. @jcupitt has done a lot of security and verification work since then. But attack surface area wasn't part of the initial design criteria, as it was originally written for use with 'safe' images (as was ImageMagick and pretty much every other tool out there). Which is not to say that is can't be completely secured against malicious data! It's just a bit harder when you have to take away functionality.

@jcupitt has provided perfect advice here, which is achievable in a linux environment:

I would say the solution is layered security.

Only enable the load libraries you really need. For example, libvips will open microscope slide images, which most websites will not require.

Keep all the image load libraries patched and updated daily.

Keep the image handling part of a site in a sandbox: a separate process, or even a separate machine, running as a low-privilege user.

Kill and reset the image handling system regularly, perhaps every few images.

This absolutely applies to ImageMagick as well.

Unfortunately, privilege escalation in Windows is trivial to achieve, and sandboxing on that platform is something neither Microsoft, nor Google, nor Adobe has ever had solved for more than a couple weeks at a time. More or less, this is a linux-only solution until Docker on Windows shows itself to have a decent security record.

Given that 95% of my existing user base hosts exclusively on Windows (to my deep sorrow), I have to provide something that can be safely used in-process, without sandboxing.

This problem isn't exclusive to Windows, either - the majority of PHP websites use LibGD or ImageMagic bindings to resize potentially malicious images - in process, with elevated privileges.

I doubt 0.01% of systems are sandboxing their image processing.

My goal with imageflow is to start out saying "no" to any features or functionality that could be an attack vector, yet provide the most commonly needed behaviors. We're only supporting image formats that browsers understand, and limiting ourselves to the most hardend codecs.

Part of the protection strategy is ensuring that every operation can calculate an upper bound of computation and RAM req prior to any heavy lifting. This will help schedulers and servers implement better DDOS attack prevention (as well as better load balancing).

We're also dealing with metadata on a whitelist basis, rather than pass-through. There are still a lot of vectors we have left to test, fuzz, and harden against, but hopefully with a small enough problem scope and codebase size we can achieve an acceptable level of security for in-process use.

jcupitt commented 8 years ago

Ah sounds very sensible. Though of course libpng and giflib have had a large number of serious vulnerabilities over the years. It looks like the last remote code execution bug on libpng was about a year ago:

https://www.cvedetails.com/vulnerability-list/vendor_id-7294/Libpng.html

Only a crash so far this year.

lilith commented 8 years ago

Yes, libpng's record gives me nightmares. I'm not aware of a better option, so I'll probably be spending some extra time fuzzing it for my own peace of mind, and reading the source end-to-end a few times. I'm on the fence about giflib. I've written my own gif decoders/encoders a few times (in managed languages), but that decision is delayed until I have had a full read-through of giflib.

Hoping to help get official libpng set up with automated valgrind/travis soon. My fork has it, but I recently switched to Conan for package management, and it needs to be updated.

Do you know how many greenfield projects are using known vulnerable versions of zlib? It pisses me off. I notify them (MSFT), and they add to the readme "Only use with trusted image data".

I'd love to chat with you sometime about libvips architecture. I've got a feeling you know some operation graph optimization techniques that would be really helpful in imageflow. Also, if it works out, it would be neat if imageflow offered a clean subset of libvips functionality, such that we can trivially transition users to libvips when they need more.

I've drafted an update to the security section of Readme, and took the liberty of quoting you, if that's ok? Let me know if there are any changes you would like to see, or a hyperlink added, etc. I'm trying to avoid tying particular transgressions to particular libraries, which makes it a bit vague.

https://github.com/imazen/imageflow/pull/36

jcupitt commented 8 years ago

Sure, #36 looks fine, though there's no need to be so nice about libvips (you'll make me blush!), I'd remove the text in brackets.

I suppose the point I was trying to make was that it's almost impossible to make a perfectly secure image handling library, since the loaders are all so wonky. You must have defence in depth, you can't rely on a single thing to keep you safe. Perhaps that's obvious!

I think it's fairly common for larger sites to have the image handling done on a separate server or in a VM. That's certainly what tumblr does, I don't know many others in much detail.

libvips doesn't really do much graph optimisation, though it does have nice things for handling sharing and for detecting reuse. For example, suppose you do (in Ruby for convenience, but it works in from any language):

a = Vips::Image.new_from_file "x.jpg"
b = a.avg

vips will decode x.jpg to a memory buffer or temporary file, then scan it to calculate the average. If some time later you execute:

c = a.avg

vips won't scan again, it'll immediately give you the previous result.

In the same way, it detects sharing in the graph. Suppose you execute:

a = Vips::Image.new_from_file "x.jpg"
a *= 42
r, g, b = a.bandsplit
r += g
a = r.bandjoin [g, b]
a.write_to_file "y.jpg"

(rather contrived, you'd really use .recomb for this, not split, add, join), there's a strong danger you could find yourself doing the a *= 42 three times. vips detects the fact that the image is being split and rejoined, and will "common up" the multiplication.

vips also runs lockless, which is fun. It's able to execute a complex, branching pipeline in parallel with a single lock for the read library and a single lock for write. This helps a lot on many-core machines, where mutexes become expensive.

lilith commented 8 years ago

I suppose the point I was trying to make was that it's almost impossible to make a perfectly secure image handling library, since the loaders are all so wonky. You must have defence in depth, you can't rely on a single thing to keep you safe. Perhaps that's obvious!

Obvious, yes, but not particularly common, unfortunately.

I'm hoping that native Rust codecs are mature by the time imageflow is production ready.

https://github.com/PistonDevelopers/image-png https://github.com/PistonDevelopers/image

Codecs are the perfect storm of complexity. Complier verification seems like the only decent hope.

Once imageflow is complete, I'm hoping to port it to Rust as well. Is it weird that I may be using C as a prototyping language? (Although, if Rust doesn't introduce more flexible iterators, I might be stuck with C).

lovell commented 8 years ago

Brilliant, thank you both for making the time to comment, this is a fantastic thread of conversation for anyone who worries about the security aspects of real-time image processing.

Hopefully John agrees that, thanks to a combination Valgrind, AFL and crash reports from users of sharp, libvips is in a more secure place than it was "two or three years ago".

Nathaniel, I hadn't appreciated the Windows-focussed nature of your work here (but even Docker on Linux has security gotchas to beware of such as gaining root privileges via volume mapping) so the very best of luck with this!

imazen / imageflow

Question: how does an "insecure" library become "secure"? #35