gplotMakeOutput Command Injection Vulnerability

ghost commented 6 years ago

An exploitable command injection vulnerability exists in the gplotMakeOutputfunction of Leptonica 1.74.4. A specially crafted gplot rootname argument can cause a command injection resulting in arbitrary code execution. An attacker can provide a malicious path as input to an application that passes attacker data to this function to trigger this vulnerability.

For details refer to this write-up. This has been assigned CVE-2018-3836.

DanBloomberg commented 6 years ago

That command injection vulnerability has been removed in 1.75.1.

ghost commented 6 years ago

The commit in question fixed several issues: b88c821f8d347bce0aea86d606c710303919f3d2

For those interested in a backport, have a look at this.

regiwils commented 6 years ago

This is the correct CVE-2018-3836 for this issue

bwhacks commented 6 years ago

This doesn't seem to be fixed, as the "bad character" check still allows e.g. "$(command)".

carnil commented 6 years ago

FTR, the issue as raised by @bwhacks in https://github.com/DanBloomberg/leptonica/issues/303#issuecomment-366472212 got assigned CVE-2018-7440

DanBloomberg commented 6 years ago

@carnil

And it was fixed 4 hours ago on the master. Is there anything further that needs to be done?

santiagorr commented 6 years ago

Also related to this issue: gplotMakeOutput function does not block either the '/' character. CVE-2018-7442 has been assigned for a potential path traversal and arbitrary file overwrite.

DanBloomberg commented 6 years ago

I don't understand this comment.

The '/' character is required for paths. The fix in gplotCreate() guarantees that gplot->cmdname is a legitimate path.

egorpugin commented 6 years ago

As I understand it will be much safer to not use system() calls. It's very very hard to protect and prepare command line for it. Better to call via exec* family or other OS specific calls.

Am I missing something?

DanBloomberg commented 6 years ago

I admit to not knowing anything about the differences in security when using system as opposed to fork/exec/wait.

Can someone point me to some high-level references on this issue?

bwhacks commented 6 years ago

The problem with allowing "/" is that a base name can then be a relative path referring to any part of the filesystem. Suppose I can get a server or some other user to process a file with leptonlib, and it will generate an output file using the path base_name + ".ext", and suppose that I have access to /home/bwhacks on the same system.

I can set the base name to "../../../../../../../../../../home/bwhacks/evil" and put a symlink at "/home/bwhacks/evil.ext" that points to a file belonging to the targetted user. When they process the file, they'll write through that symlink and trash one of their own files. So this would be a denial of service attack. It's not a particularly likely method of exploitation, but it ought to be closed off.

bwhacks commented 6 years ago

The system() function will invoke the system shell (/bin/sh on Posix systems, cmd.exe on Windows), which is what processes redirection, environment variables, etc.

On Posix systems, fork()+execve() bypasses the shell, meaning that there's no need to worry about all the shell's special characters, or whether to quote arguments.

On Windows, you can either use the native process creation API (CreateProcess) or the C library's spawnve(). But if you need to support spaces in arguments, spawnve() may not do so correctly; see https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way.

stweil commented 6 years ago

For POSIX, posix_spawnp would provide similar functionality as Windows spawnvpe.

amitdo commented 5 years ago

Can someone point me to some high-level references on this issue?

SEI CERT C Coding Standard ENV33-C. Do not call system() https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152177

DanBloomberg commented 5 years ago

Hi Amit,

Even though I believe we have addressed the vulnerabilities in the reports (and quite a few others), I would like to keep this issue open for a while, because security is an on-going process and we have not had formal closure on some of the reports.

For background, the situation is that people are looking very carefully at issues related to security, and are filing issues where they have found vulnerabilities.

We have been taking their suggestions seriously, and fixing the code to prevent possible exploits as quickly as possible. Major changes were made between 1.74 and 1.76, for example, to prevent creating named temp directories, writing to named files, and forking other processes in our production code. These capabilities are still available for testing, as they must be.

At present, we are also running the best Google fuzzing software on a fairly continuous basis on leptonica. This is where different "seeds" are used as input to programs, with efficient sanitizers, in an effort to cause (and detect) memory errors. A number of issues came up initially, but none in the past several weeks. After fixing, all have been quickly pushed to the master.

We are also running Coverity Scan and LGTM to clean up the code. The progress has been significant. And I have continued to write new formal regression tests to improve code coverage. valgrind has been employed on every test. We now have about 125 tests in the suite. These tests take about 80 seconds to run using optimized (-O2) code, and write about 160 MB of data (mostly images) in 1700 files.

I have not heard back from the people doing the security checks that wrote the alerts earlier this year. I do expect that they will revisit the issues at some point.

-- Dan

amitdo commented 5 years ago

Great summary of the security work done for Leptonica, Dan.

amitdo commented 5 years ago

Another reference from SEI CERT C Coding Standard related to this thread: Input Output (FIO) FIO02-C. Canonicalize path names originating from tainted sources https://wiki.sei.cmu.edu/confluence/display/c/FIO02-C.+Canonicalize+path+names+originating+from+tainted+sources

Dan, you already changed the code to use realpath() as suggested there, so probably no action is needed.

DanBloomberg / leptonica

gplotMakeOutput Command Injection Vulnerability #303