immunant / c2rust

Migrate C code to Rust
https://c2rust.com/
Other
3.99k stars 238 forks source link

c2rust translated rust code gives different output from gcc and clang #1112

Closed Yeaseen closed 2 months ago

Yeaseen commented 2 months ago

Input C code (test.c)

#include <stdio.h>

int globe = 35;

int* fn1(int strd) {
    static int result; 
    result = globe;
    if (strd > 2) {
        result += 4;
    }
    result += strd;
    return &result; 
}

int main() {
    char q_q = 34;
    int p_p = 110; 
    p_p += q_q ^ p_p || globe; 
    q_q = globe ^ p_p ^ 25;
    printf("%d\n", *fn1(q_q));
    printf("%d\n", *fn1(p_p));
    int final=*fn1(q_q) += *fn1(p_p); 
    printf("Result: %d\n", final ); 
    return 0;
}

GCC+Clang Output with UBSan+Asan

Screenshot 2024-08-24 at 4 58 09 AM

Both GCC and Clang give the same output

Translated Rust code's output

Screenshot 2024-08-24 at 5 08 22 AM

Translated Rust Code

#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
use ::transpiled_code::*;
extern "C" {
    fn printf(_: *const libc::c_char, _: ...) -> libc::c_int;
}
#[no_mangle]
pub static mut globe: libc::c_int = 35 as libc::c_int;
#[no_mangle]
pub unsafe extern "C" fn fn1(mut strd: libc::c_int) -> *mut libc::c_int {
    static mut result: libc::c_int = 0;
    result = globe;
    if strd > 2 as libc::c_int {
        result += 4 as libc::c_int;
    }
    result += strd;
    return &mut result;
}
unsafe fn main_0() -> libc::c_int {
    let mut q_q: libc::c_char = 34 as libc::c_int as libc::c_char;
    let mut p_p: libc::c_int = 110 as libc::c_int;
    p_p += (q_q as libc::c_int ^ p_p != 0 || globe != 0) as libc::c_int;
    q_q = (globe ^ p_p ^ 25 as libc::c_int) as libc::c_char;
    printf(b"%d\n\0" as *const u8 as *const libc::c_char, *fn1(q_q as libc::c_int));
    printf(b"%d\n\0" as *const u8 as *const libc::c_char, *fn1(p_p));
    let ref mut fresh0 = *fn1(q_q as libc::c_int);
    *fresh0 += *fn1(p_p);
    let mut final_0: libc::c_int = *fresh0;
    printf(b"Result: %d\n\0" as *const u8 as *const libc::c_char, final_0);
    return 0 as libc::c_int;
}
pub fn main() {
    unsafe { ::std::process::exit(main_0() as i32) }
}

Warning while cargo build

I found this warning while building the above translated code by c2rust

warning: creating a mutable reference to mutable static is discouraged
  --> src/runner.rs:16:12
   |
16 |     return &mut result;
   |            ^^^^^^^^^^^ mutable reference to mutable static
   |
   = note: for more information, see issue #114447 <https://github.com/rust-lang/rust/issues/114447>
   = note: this will be a hard error in the 2024 edition
   = note: this mutable reference has lifetime `'static`, but if the static gets accessed (read or written) by any other means, or any other reference is created, then any further use of this mutable reference is Undefined Behavior
   = note: `#[warn(static_mut_refs)]` on by default
help: use `addr_of_mut!` instead to create a raw pointer
   |
16 |     return addr_of_mut!(result);

Modified Code to suppress the warning

To suppress the warning, I modified the translated rust code. Now, there's no warning. Still, the same output before the modification. and the output is still different from what GCC and Clang give.

#![allow(dead_code, mutable_transmutes, non_camel_case_types, non_snake_case, non_upper_case_globals, unused_assignments, unused_mut)]
use ::transpiled_code::*;
use std::ptr::addr_of_mut;

extern "C" {
    fn printf(_: *const libc::c_char, _: ...) -> libc::c_int;
}

#[no_mangle]
pub static mut globe: libc::c_int = 35 as libc::c_int;

#[no_mangle]
pub unsafe extern "C" fn fn1(mut strd: libc::c_int) -> *mut libc::c_int {
    static mut result: libc::c_int = 0;
    result = globe;
    if strd > 2 as libc::c_int {
        result += 4 as libc::c_int;
    }
    result += strd;
    return addr_of_mut!(result);
}

unsafe fn main_0() -> libc::c_int {
    let mut q_q: libc::c_char = 34 as libc::c_int as libc::c_char;
    let mut p_p: libc::c_int = 110 as libc::c_int;
    p_p += (q_q as libc::c_int ^ p_p != 0 || globe != 0) as libc::c_int;
    q_q = (globe ^ p_p ^ 25 as libc::c_int) as libc::c_char;
    printf(b"%d\n\0" as *const u8 as *const libc::c_char, *fn1(q_q as libc::c_int));
    printf(b"%d\n\0" as *const u8 as *const libc::c_char, *fn1(p_p));
    let ref mut fresh0 = *fn1(q_q as libc::c_int);
    *fresh0 += *fn1(p_p);
    let mut final_0: libc::c_int = *fresh0;
    printf(b"Result: %d\n\0" as *const u8 as *const libc::c_char, final_0);
    return 0 as libc::c_int;
}

pub fn main() {
    unsafe { ::std::process::exit(main_0() as i32) }
}

Output after Modification

This time cargo build doesn't show the previous warning.

Screenshot 2024-08-24 at 5 18 22 AM

Possible Root cause

The likely cause of the discrepancy is the order of evaluation of result.

Yeaseen commented 2 months ago

c4go gives the same output as gcc and clang do. So, the error is in c2rust

kkysen commented 2 months ago

Might this be due to the order of operations in an expression (for final) being unspecified? Rust does fn1(q_q) first, but C may do either first, and it might be doing fn1(p_p) first.

Yeaseen commented 2 months ago

I cross-checked this C code with c4go which preserves the C order and gives the same output as GCC and Clang do.

I found the order in that translated go code by c4go is the same as what c2rust gives here.

var final int32 = func() int32 {
        tempVar2 := &(fn1(int32(q_q)))[0]
        *tempVar2 += (fn1(p_p))[0]
        return *tempVar2
    }()
noarch.Printf([]byte("Result: %d\n\x00"), final)

I believe C has nothing to do with this, as I tested GCC and Clang with UBSan and Asan. No run time error was found.

Not sure about what error in c2rust is causing this divergence.

Yeaseen commented 2 months ago

As far as I am concerned, c2rust uses Clang-based LibTool as a front-end for the C input. It should even be c2rust's advantage to preserve this C operation. Yet, this divergence in output. The root cause is still unclear to me. If any direction, that would be great.

kkysen commented 2 months ago

I believe C has nothing to do with this, as I tested GCC and Clang with UBSan and Asan. No run time error was found.

Order of evaluation is unspecified, not undefined. So it won't be flagged by UBSan, but it gives unspecified results.

Yeaseen commented 2 months ago

yeah. Specification might be an issue. But I think Rust is doing 150 + 150 instead of 124 + 150.

In the translated rust code, the following modification gives the perfect output 274

//*fresh0 += *fn1(p_p); // I believe this part is buggy

*fresh0 = *fresh0 + *fn1(p_p);

I think this needs special care to preserve equivalence. The same bug is also found in CxGo

thedataking commented 2 months ago

@kkysen has already provided a satisfactory explanation for the results you observe @Yeaseen:

Order of evaluation is unspecified, not undefined. So it won't be flagged by UBSan, but it gives unspecified results.

You can read more about the order of evaluation in C here https://en.cppreference.com/w/c/language/eval_order.

I've copied your code example in to godbolt.org. Depending on what compiler you chose, the result of the computation, the result is either 274 or 300. For x64 msvc v19, it computes 300 just like the Rust compiler.