x64 openat gets wrong dirfd

remexre commented 6 years ago

Recent glibcs openat both /etc/ld.so.cache and /lib/x86_64-linux-gnu/libc.so.6 during program start. The x86/ARM version seems not to work on x64; simply applying

diff --git a/execution/linux_syscalls.ml b/execution/linux_syscalls.ml
index 9f455f6..d8211ec 100644
--- a/execution/linux_syscalls.ml
+++ b/execution/linux_syscalls.ml
@@ -4327,7 +4327,8 @@ object(self)
         | (X86, 294) -> (* migrate_pages *)
             uh "Unhandled Linux/x86 system call migrate_pages (294)"
         | (ARM, 322)    (* openat *)
-        | (X86, 295) -> (* openat *)
+        | (X86, 295)    (* openat *)
+        | (X64, 257) -> (* openat *)
          let (arg1, arg2, arg3) = read_3_regs () in
          let arg4 = (if (Int64.logand arg3 0o100L) <> 0L then
                        get_reg arg_regs.(3)

causes https://github.com/remexre/fuzzball/blob/cgc-branch/execution/linux_syscalls.ml#L1547 to raise an out-of-bounds exception; it looks like on x64, dirfd is 4294967196. Not so coincidentally, ((int) 4294967196L) == -100... Could something be truncating it?

smcc commented 6 years ago

I think it's the right intuition that this is a 32-bit vs. 64-bit problem with integer representation. But I don't think it's quite that the value is being truncated; it's actually more like that it's not getting truncated. The dirfd is an int, so it's a 32-bit value even on x64, but it's being stored in a 64-bit register, so it looks like 2^32-100 instead of -100 because it hasn't been sign-extended. OCaml "Int64" is a signed 64-bit type, and OCaml "int" is a signed type of an unpredictable width, but what's really going on is that the low half of the register is holding the dirfd as a signed 32-bit integer. To get negative bit patterns correctly translated into negative OCaml int, we need to get OCaml to interpret the 31st bit of the register value as the sign bit. I think it would be possible to do this by going Int64 -> Int32 -> int, but we've mostly avoided Int32 in FuzzBALL, so for us the more idiomatic operator is a 32-bit to 64-bit sign extension within an Int64 which is Exec_util.fix_s32. My guess is that instead of

dirfd = Int64.to_int arg1

we want something more like

dirfd = Int64.to_int (fix_s32 arg1)

You can see we do something similar for the "fd" argument of the x64 "mmap", which is analogous is that it is either a file descriptor or a special negative value.

It also looks to me like the fix_s32 would be correct on 32-bit platforms too, so you can still keep the architectures together as in the patch in your original message.

remexre commented 6 years ago

Huh, this is actually the first time I realized file descriptors are ints instead of ssize_t's... Thanks!

bitblaze-fuzzball / fuzzball

x64 openat gets wrong dirfd #27