gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support
https://grapheneproject.io
GNU Lesser General Public License v3.0
772 stars 261 forks source link

SGX protected files possible vulnerability #2360

Open shmeni opened 3 years ago

shmeni commented 3 years ago

There might be a possible vulnerability in the implementation of the protected files feature. Specifically, we tested the following code that opens two protected files (configured the tmp dir to be protected according to the very intuitive examples you provide)

int fd1 = open("tmp/sec1.txt", O_CREAT | O_RDWR, 0644); int fd2 = open("tmp/sec2.txt", O_CREAT | O_RDWR, 0644);

Then we tested what would happen if an attacker changed the return value of open so that the same file descriptor value was used for both files. Interestingly that led to a crash inside the enclave, debugging showed that it was due to accessing invalid memory in the ipf_close function. The specific culprit is the following line: https://github.com/oscarlab/graphene/blob/1d5dfb4018d865894e0cd959a1c2a91ebfe8749d/Pal/src/host/Linux-SGX/protected-files/protected_files.c#L354

dimakuv commented 3 years ago

Then we tested what would happen if an attacker changed the return value of open so that the same file descriptor value was used for both files. Interestingly that led to a crash inside the enclave...

Could you give a more complete example? It's unclear what your program tries to do next with these two file handles. Do you try to read from them? Write to them? Simply close them immediately?

shmeni commented 3 years ago

Just opening and closing the files is enough to trigger the crash. Here's the complete code we used: void test() { int fd1 = open("tmp/sec1.txt", O_CREAT | O_RDWR, 0644); int fd2 = open("tmp/sec2.txt", O_CREAT | O_RDWR, 0644); close(fd1); close(fd2); }

For completeness, we configured tmp dir in the manifest to be protected: sgx.protected_files.tmpdir = file:tmp Finally, our test intercepted the open() call and returned the same fd value for both open requests (from the "untrusted" code part).

dimakuv commented 3 years ago

Thanks @shmeni! This is indeed a bug in our code (not sure if this could be exploited).

The root cause is that we don't check whether pal_handle->file.fd returned by ocall_open() is already assigned to some other PAL handle (see https://github.com/oscarlab/graphene/blob/da0bd9585a4209c4cf769fa65bcad93d6cb02c2c/Pal/src/host/Linux-SGX/db_files.c#L70).

We do have such checks at the LibOS layer (which has its own FD map), but there are no such checks at the PAL layer. This was probably Ok for "simple" handles, but leads to corner cases with Protected Files.

I guess we should introduce a map of FDs at the PAL level as well, and return errors when we detect such a case? What do you think @mkow @boryspoplawski @pwmarcz @yamahata ?

P.S. To be honest, I didn't debug why it segfaults as Meni mentioned in the top comment. It looks like the pf (protected file object) was already destroyed during the second close(), though I don't see why.

boryspoplawski commented 3 years ago

I'm not that familiar with our pf implementation, but why cannot we allow for two handles to have the very same fd? If we break the underlying file on the host it's fine (since the host is messing with us anyway). To me it doesn't look like the duplicated fd is the culprit...

dimakuv commented 3 years ago

I agree that this is hardly the root cause for this issue (looks like some missing checks on Protected Files code?).

But shouldn't this be considered a part of sanitization of OCALL return values? An open() returning an already-existing FD is surely a wrong return value.

boryspoplawski commented 3 years ago

Just sounds like an useless thing - it doesn't stop any attack whatsoever, can only hide issues in other places.

mkow commented 3 years ago

Yup, and the host can always duplicate an FD to create an illusion of separate descriptors which are in fact mapped to the same resource. It's just that our code should work correctly in also such cases.

dimakuv commented 3 years ago

@shmeni I cannot reproduce your crash. Here is my diff on the latest Graphene master branch:

--- a/LibOS/shim/test/regression/helloworld.c
+++ b/LibOS/shim/test/regression/helloworld.c
@@ -1,6 +1,15 @@
+#include <fcntl.h>
 #include <stdio.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>

 int main(int argc, char** argv) {
+    int fd1 = open("tmp/sec1.txt", O_CREAT | O_RDWR, 0644);
+    int fd2 = open("tmp/sec2.txt", O_CREAT | O_RDWR, 0644);
+    close(fd1);
+    close(fd2);
+
     printf("Hello world!\n");
     return 0;
 }

diff --git a/LibOS/shim/test/regression/manifest.template b/LibOS/shim/test/regression/manifest.template
index e8da3ea7..bab2b9db 100644
--- a/LibOS/shim/test/regression/manifest.template
+++ b/LibOS/shim/test/regression/manifest.template
@@ -33,10 +33,12 @@ sgx.trusted_files.libstdcxx = "file:/usr{{ arch_libdir }}/libstdc++.so.6"
 sgx.trusted_files.entrypoint = "file:{{ entrypoint }}"
 sgx.trusted_files.exec_victim = "file:exec_victim"

-sgx.allowed_files.tmp_dir = "file:tmp/"
+sgx.protected_files.tmp_dir = "file:tmp/"
 sgx.allowed_files.root = "file:root" # for getdents test
 sgx.allowed_files.testfile = "file:testfile" # for mmap_file test

 sgx.thread_num = 16

 sgx.nonpie_binary = 1
+
+sgx.protected_files_key = "ffeeddccbbaa99887766554433221100"

Then I build this and run in debug mode: GDB=1 graphene-sgx ./helloworld. I manually replace the result of the second ocall_open() with the FD of the first protected file (in my case fd = 15).

And here is the output:

error: cb_write(15, 0xf67c040, 0, 4096): write failed: -9

So as expected, I got a bad write error (-9 is the UNIX error code EBADF /* Bad file number */). I observe no crash inside the enclave.

boryspoplawski commented 3 years ago

I also failed to reproduce it (modified sgx_ocall_open). @shmeni Could you provide more details or even complete PoC?

shmeni commented 3 years ago

Sorry about the lack of details @dimakuv. Let me try again:

Here's the strace output for the example (I've modified the same helloworld test as you did, so I got the same fd numbers), I've marked down the exact location where I changed the values. I think my confusion was that I assumed the protected files open the files regularly, but it actually opens the file twice. Changing the second open fd causes the PAL segfault.

open("./tmp/sec1.txt", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("tmp/sec1.txt", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("./tmp/sec1.txt", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = 14
fstat(14, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(14, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("./tmp/sec1.txt", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = 15
fstat(15, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl(15, F_GETFL)                      = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
fcntl(15, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
close(15)                               = 0
open("./tmp/sec2.txt", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("tmp/sec2.txt", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("./tmp/sec2.txt", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = 15
fstat(15, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(15, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("./tmp/sec2.txt", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = 16

sys_open changed return value from 16 to 15

fstat(15, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl(15, F_GETFL)                      = 0x8002 (flags O_RDWR|O_LARGEFILE)
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x55618587f65f} ---
rt_sigreturn({mask=[USR2]})             = 3
pwrite64(15, "GRAPH_PF\1\0\301\347'\353\370\v\332(\t=\177gx\326'\354\231_8\304\366\315"..., 4096, 0) = 4096
close(15)                               = 0
pwrite64(14, "GRAPH_PF\1\0007B\337\3513\2477\320\251\370:\371\342\347\372\205`1\323C\23\327"..., 4096, 0) = 4096
close(14)                               = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x3000} ---
rt_sigreturn({mask=[USR2]})             = 3
write(2, "error: *** Unexpected exception "..., 78error: *** Unexpected exception occurred inside PAL at RIP = +0x000276e7! ***
) = 78
write(2, "error: (untrusted PAL sent PAL e"..., 42error: (untrusted PAL sent PAL event 0x2)
) = 42
write(2, "error: rax: 0x00000000 rcx: 0x00"..., 256error: rax: 0x00000000 rcx: 0x00000000 rdx: 0x00000000 rbx: 0x06044200
rsp: 0x0b5318d0 rbp: 0x0b531900 rsi: 0x00000000 rdi: 0x00000000
r8 : 0x0b62da99 r9 : 0x0601ffc0 r10: 0x000001a4 r11: 0x00000246
r12: 0x060440a0 r13: 0x06088ef0 r14: 0x0b5e3bb2 r15: 0x0b) = 256
write(2, "531f38\nrflags: 0x00010202 rip: 0"..., 42531f38
rflags: 0x00010202 rip: 0x0b6486e7
) = 42
exit_group(1)                           = ?
+++ exited with 1 +++

Just so it'll be easy to reproduce without GDB - I created a dummy hack in the open call that returns the wrong value in this particular case (it should be safe to ignore my commenting out of the /dev/kmsg mount as it was just due to some trouble I faced with the latest master branch). Fyi, tested it with an empty tmp dir such that the files didn't exist.

diff --git a/LibOS/shim/test/regression/helloworld.c b/LibOS/shim/test/regression/helloworld.c
index 105b300c..a64accea 100644
--- a/LibOS/shim/test/regression/helloworld.c
+++ b/LibOS/shim/test/regression/helloworld.c
@@ -1,6 +1,15 @@
+#include <fcntl.h>
 #include <stdio.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>

 int main(int argc, char** argv) {
-    printf("Hello world!\n");
-    return 0;
+    int fd1 = open("tmp/sec1.txt", O_CREAT | O_RDWR, 0644);
+    int fd2 = open("tmp/sec2.txt", O_CREAT | O_RDWR, 0644);
+    close(fd1);
+    close(fd1);
+    close(fd2);
+
+     printf("Hello world!\n");
+     return 0;
 }
diff --git a/LibOS/shim/test/regression/manifest.template b/LibOS/shim/test/regression/manifest.template
index e8da3ea7..a2d02c73 100644
--- a/LibOS/shim/test/regression/manifest.template
+++ b/LibOS/shim/test/regression/manifest.template
@@ -22,9 +22,9 @@ fs.mount.bin.type = "chroot"
 fs.mount.bin.path = "/bin"
 fs.mount.bin.uri = "file:/bin"

-fs.mount.devkmsg.type = "chroot"
-fs.mount.devkmsg.path = "/dev/kmsg"
-fs.mount.devkmsg.uri = "dev:/dev/kmsg"
+#fs.mount.devkmsg.type = "chroot"
+#fs.mount.devkmsg.path = "/dev/kmsg"
+#fs.mount.devkmsg.uri = "dev:/dev/kmsg"

 sgx.trusted_files.runtime = "file:{{ graphene.runtimedir() }}/"
 sgx.trusted_files.libgcc_s = "file:{{ arch_libdir }}/libgcc_s.so.1"
@@ -33,10 +33,12 @@ sgx.trusted_files.libstdcxx = "file:/usr{{ arch_libdir }}/libstdc++.so.6"
 sgx.trusted_files.entrypoint = "file:{{ entrypoint }}"
 sgx.trusted_files.exec_victim = "file:exec_victim"

-sgx.allowed_files.tmp_dir = "file:tmp/"
+sgx.protected_files.tmp_dir = "file:tmp/"
 sgx.allowed_files.root = "file:root" # for getdents test
 sgx.allowed_files.testfile = "file:testfile" # for mmap_file test

 sgx.thread_num = 16

 sgx.nonpie_binary = 1
+
+sgx.protected_files_key = "ffeeddccbbaa99887766554433221100"
diff --git a/Pal/src/host/Linux-SGX/sgx_enclave.c b/Pal/src/host/Linux-SGX/sgx_enclave.c
index 76945af5..b665c005 100644
--- a/Pal/src/host/Linux-SGX/sgx_enclave.c
+++ b/Pal/src/host/Linux-SGX/sgx_enclave.c
@@ -113,6 +113,11 @@ static long sgx_ocall_open(void* pms) {
     // FIXME: No idea why someone hardcoded O_CLOEXEC here. We should drop it and carefully
     // investigate if this cause any descriptor leaks.
     ret = INLINE_SYSCALL(open, 3, ms->ms_pathname, ms->ms_flags | O_CLOEXEC, ms->ms_mode);
+
+    if ((strcmp(ms->ms_pathname, "./tmp/sec2.txt") == 0) && ret == 16) {
+           return 15;
+    }
+
     return ret;
 }
boryspoplawski commented 3 years ago

Thanks for the details, I was able to reproduce the crash. (you have close(fd1) twice, but that does not influence the problem in any way)

dimakuv commented 3 years ago

@shmeni Would be interesting if you could cause a crash on Protected Files now, with #2372.

shmeni commented 3 years ago

@dimakuv Happy to say that I don't see this crash anymore with #2372

dimakuv commented 3 years ago

My hacky fix in #2372 is sub-par. We should actually refactor Protected Files in a much more comprehensive way, because it also shows it deficiencies for @pwmarcz's work on the FS.

Currently postponing resolving this issue. Until we have a proper PF re-implementation.