immunant / c2rust

Migrate C code to Rust
https://c2rust.com/
Other
3.99k stars 240 forks source link

Avoid introducing typecasts from C `char` to Rust `libc::c_int` in comparisons to literals #622

Closed 64kramsystem closed 2 years ago

64kramsystem commented 2 years ago

In the source codebase I'm porting, there are char variables, that are compared to char literals.

In the transpiled codebase, the compared values are (explicitly) casted to i32, which is redundant.

Now, I can imagine the reasons why this happen - if C2rust works on some IR, on a modern architecture, comparing a 32 bit value may be preferrable to an 8 bit one, however, in the context of transpiling a project, faithfulness is paramount.

Reference code follows.

Original:

// function prototype
int get (void);

// Global
char ch;

// Code of interest
do {
  ch = get();
} while (ch<'0' || ch>'9');

Translation (slighly simplified):

// Global
static mut ch: libc::c_char;

// function signature
fn get() -> libc::c_int;

// Code of interest
loop {
    ch = get() as libc::c_char;
    if !((ch as libc::c_int) < '0' as i32 || ch as libc::c_int > '9' as i32) {
        break;
    }
}

Ideal code of interest

loop {
    ch = get() as libc::c_char;
    if !(ch < '0' as i8 || ch > '9' as i8) {
        break;
    }
}
kkysen commented 2 years ago

I believe the reason this is done is because character literals in C, like '0', have type int. Thus, '0' in C is transpiled to '0' as i32 (I'm not sure why that's not '0' as libc::c_int). Then to do the comparison, we also cast the LHS to libc::c_int (sign extension of char widening in C is implementation-defined, and we just use the Rust as implementation here).

64kramsystem commented 2 years ago

Right!

kkysen commented 2 years ago

I do think ch < '0' as libc::c_int would be better and equivalent in this case, but that might be much easier to refactor outside of the transpiler.