immunant / c2rust

Migrate C code to Rust
https://c2rust.com/
Other
3.98k stars 238 forks source link

Miscompilation of varargs on some targets #246

Open pablosichert opened 4 years ago

pablosichert commented 4 years ago

Hi there,

first of all: thanks for this great tool, being able to transpile C to Rust is really valuable!

Unfortunately, the handling around varargs doesn't seem to work for all targets, specifically wasm32. To observe the behavior, I provided a C function:

https://github.com/pablosichert/rust-varargs-test/blob/e10aead1007cbfa923a006952436d0a324583da2/src/variadic.c#L3-L9

that will be compiled to the following Rust code:

https://github.com/pablosichert/rust-varargs-test/blob/e10aead1007cbfa923a006952436d0a324583da2/src/variadic.rs#L15-L28

and a test case:

https://github.com/pablosichert/rust-varargs-test/blob/e10aead1007cbfa923a006952436d0a324583da2/tests/tests.rs#L16-L37

that succeeds on native, but fails on wasm32:

https://github.com/pablosichert/rust-varargs-test/runs/543678432.

The problem here is that .as_va_list clones the list on wasm32, but is a mutable reference on native, see https://doc.rust-lang.org/src/core/ffi.rs.html#177-201.

I asked about this on the Rust #compiler channel on Discord, I'll just paste some of it here:

At a glance that does not seem likely to be a bug. as_va_list is explicitly for C interop, and C ABIs differ in how va_lists are represented. Note that the difference is not just wasm vs everything else, several "native" targets get the same impl as wasm. If you want deterministic by-ref vs copy behavior in a Rust program, you can use with_copy or pass a &mut VaList around. If you pass the va_list on to C, you probably should not keep using it afterwards.

It seems odd that as_va_list is publicly available, though. Please do file the issue, maybe the best fix is to change the API surface?

uh, nevermind the part about as_va_list being publicly available. If VaListImpl is generally exposed and used by-value, then its existence makes sense. But conversely, why is VaListImpl public? All questions that I can't answer, best file a rust-lang/rust issue too and then we can find someone who knows the history of that feature to clear it up

https://github.com/rust-lang/rust/pull/59625#issuecomment-482005611

I haven't opened an issue on https://github.com/rust-lang/rust yet, since I'd like to know if this issue would need to be / could be fixed on the transpilation level.

Code similar to my reproduction case can be found in the wild, e.g. SQLite: https://github.com/sqlite/sqlite/blob/ef7d5187a74fdddee3dabfaddebaf63ca316aac7/src/printf.c#L796.

ahomescu commented 4 years ago

@pablosichert Interesting find, we'll definitely fix it!

we can find someone who knows the history of that feature to clear it up

That would actually be myself (who implemented and tested the feature in rustc) and eddyb who proposed the final design and also reviewed my PR. I'm happy to help with either a C2Rust and rustc fix.

The rustc API basically tries to copy whatever clang does as faithfully as possible (modulo the as_va_list function and a few others which were unfortunate workarounds). Figuring out what clang does for wasm32 and why it does that was a pain in itself (it turned out that it basically replicates PNaCl which had its own design doc which isn't even available anymore).

        let mut argument: libc::c_int = arguments.as_va_list().arg::<libc::c_int>();

At first glance, this looks wrong to me. as_va_list should only be called on variadic argument callees, never on arg. @pablosichert If you manually remove that as_va_list call here and call arg directly on arguments, does that fix your problem? If so, then this looks like a C2Rust problem.

pablosichert commented 4 years ago

If I remove the the .as_va_list() here, it works (I also tried that as a first workaround). However, there were a lot more occurrences in the transpiled code and I could not unconditionally remove all .as_va_list() calls, as some of them would be needed when called on a VaListImpl.

I'm not sure if you can distinguish between those cases during transpilation?

I'll try to come up with an example for that tomorrow.

ahomescu commented 4 years ago

Yeah that definitely sounds like a C2Rust bug, you should never have as_va_list() just before an arg call.

Edit: I think I know what's going on: we add a as_va_list() call for every callee that takes a va_list including va_arg, which we shouldn't.

ahomescu commented 4 years ago

@pablosichert I pushed a fix, could you build C2Rust from commit https://github.com/immunant/c2rust/commit/9c018822691b7ed9e1da5df59fb75b655840f82d and try it?

pablosichert commented 4 years ago

Tried it and looks good at a first glance, thank you!

It still segfaults at a later point (possibly unrelated), so I'll do some more extensive testing today.

pablosichert commented 4 years ago

Your commit seems to fix this issue.

Do you feel there should still be a follow up on rustc? Regarding the following:

It seems odd that as_va_list is publicly available, though. Please do file the issue, maybe the best fix is to change the API surface?

uh, nevermind the part about as_va_list being publicly available. If VaListImpl is generally exposed and used by-value, then its existence makes sense. But conversely, why is VaListImpl public? All questions that I can't answer, best file a rust-lang/rust issue too and then we can find someone who knows the history of that feature to clear it up

pablosichert commented 4 years ago

It still segfaults at a later point (possibly unrelated)

That was due to https://github.com/immunant/c2rust/issues/248.

This case seems to be resolved.

pablosichert commented 4 years ago

I tried transpiling C using the -m32 flag, but it seems to break specifically with varargs.

See the diff https://github.com/pablosichert/rust-varargs-test/commit/ac1db0e02f17e9ea22c85d84fa40cface6c4466c and build error here:

 error[E0599]: no method named `arg` found for type `*mut i8` in the current scope
  --> src/variadic.rs:19:51
   |
19 |         let mut argument: libc::c_int = arguments.arg::<libc::c_int>();
   |                                                   ^^^ method not found in `*mut i8`
   |
   = note: try using `<*const T>::as_ref()` to get a reference to the type behind the pointer: https://doc.rust-lang.org/std/primitive.pointer.html#method.as_ref
   = note: using `<*const T>::as_ref()` on a pointer which is unaligned or points to invalid or uninitialized memory is undefined behavior

error[E0308]: mismatched types
  --> src/variadic.rs:36:12
   |
36 |     list = args.clone();
   |            ^^^^^^^^^^^^ expected *-ptr, found struct `std::ffi::VaListImpl`
   |
   = note: expected raw pointer `*mut i8`
                   found struct `std::ffi::VaListImpl<'_>`

The FAQ state that cross-compiling is not supported at the moment. Does that also include a restriction that transpiling on any architecture other than x86-64 is not supported?

ahomescu commented 4 years ago

Does that also include a restriction that transpiling on any architecture other than x86-64 is not supported?

Not explicitly, but we haven't tested any other architecture. The others might be broken in ways we don't know about.

Are you still targeting wasm32? It's possible that we incorrectly translate va_list on that.

pablosichert commented 4 years ago

Are you still targeting wasm32? It's possible that we incorrectly translate va_list on that.

Yes targeting wasm32 from the generated Rust is what I'm trying to achieve. However, va_list doesn't seem to be the issue here, it seemed to run fine on Wasm. It hit a wall later because of https://github.com/immunant/c2rust/issues/248.

I tried transpiling using the -m32 flag to circumvent wrong calculations for sizeof/offsetof, since x86 is the closest to wasm32.

If there was some way to control the stage that is responsible for evaluating C builtins, I would also be fine with that.

pablosichert commented 4 years ago

I think what's interesting in the diff between 64-bit and 32-bit is that on 64-bit the output is

pub type __builtin_va_list = [__va_list_tag; 1];
#[derive(Copy, Clone)]
#[repr(C)]
pub struct __va_list_tag {
    pub gp_offset: libc::c_uint,
    pub fp_offset: libc::c_uint,
    pub overflow_arg_area: *mut libc::c_void,
    pub reg_save_area: *mut libc::c_void,
}
pub type va_list = __builtin_va_list;
mut arguments: ::std::ffi::VaList

vs on 32-bit

pub type __builtin_va_list = *mut libc::c_char;
pub type va_list = __builtin_va_list;
mut arguments: va_list

Therefore a lot of methods are missing for the va_list, because it's just a *mut libc::c_char.

ahomescu commented 4 years ago

I tried transpiling using the -m32 flag to circumvent wrong calculations for sizeof/offsetof, since x86 is the closest to wasm32.

I think that's what could be causing your problems, at least for va_list. It's defined as int[4] on wasm32, which decays to int* when passed around to callees.

On the other hand, in your latest error, it seems like the problem is the code is trying to call arg on a raw pointer instead of a VaList.

ahomescu commented 4 years ago
                                          mut arguments: va_list) {

Ah there it is, arguments should be a VaList not a transpiled va_list.