immunant / c2rust

Migrate C code to Rust
https://c2rust.com/

analyze: generate shims for calls from non-rewritten to rewritten code #939

Closed spernsteiner closed 1 year ago

spernsteiner commented 1 year ago

Implements generation of unsafe shims when calling rewritten code from non-rewritten code. Given this input:

fn bad(x: *const i32) {
    good(x);
    // Do some unsupported operation
}

fn good(x: *const i32) {
    // ...
}

The tool now produces output like this:

fn bad(x: *const i32) {
    good_shim(x);
    // Do some unsupported operation
}

fn good(x: &i32) {
    // ...
}

unsafe fn good_shim(x: *const i32) {
    good(&*x);
}

Here, the tool has rewritten good to change its argument type from *const i32 to &i32, but analysis failed on bad, so the tool can't apply corresponding rewrites in the body of bad. Previously, this would cause a type error: bad would continue passing *const i32 to good, but good now expects &i32. Now, the tool generates a helper function good_shim, which wraps good but keeps the original signature from before rewriting, and rewrites bad to call good_shim instead of good, resulting in a well-typed program. Renaming the callee in bad doesn't require any analysis results, so it works even though analysis failed on bad.
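
As a minimal runnable sketch of the shim pattern described above (not the tool's literal output): the functions are given an `i32` return value and `bad` is marked `unsafe` so the effect can be observed and the example compiles on its own. Both of those details are assumptions added for illustration.

```rust
// Rewritten callee: analysis succeeded, so the parameter is now a safe reference.
// The return value is an illustrative addition so the call can be observed.
fn good(x: &i32) -> i32 {
    *x + 1
}

// Shim that keeps the original raw-pointer signature and performs the cast
// that could not be inserted at the call site in `bad`.
// Safety: `x` must be non-null, aligned, and point to a live `i32`.
unsafe fn good_shim(x: *const i32) -> i32 {
    good(unsafe { &*x })
}

// Non-rewritten caller: the only change is the callee name.
// Marked `unsafe` here (an assumption) so it may call the unsafe shim.
unsafe fn bad(x: *const i32) -> i32 {
    unsafe { good_shim(x) }
}

fn main() {
    let value = 41;
    let result = unsafe { bad(&value as *const i32) };
    println!("{result}"); // prints 42
}
```

The caller never sees `&i32`; the conversion is confined to the shim, which is why no analysis results for `bad` are needed.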

Implementation strategy: for each failed (non-rewritten) function, we walk over the HIR and look for ExprKind::Path and ExprKind::MethodCall expressions that resolve to rewritten functions. For each one, we generate a rewrite that appends _shim to the function name, and we record the DefId of the callee. Then, for each callee recorded this way, we generate a rewrite that inserts the shim function definition after the definition of the callee. The shim function has to cast arguments and return values between the original unsafe types and the safe types produced by rewriting; for this, we reuse the cast-generation machinery from rewrite::expr::mir_op and rewrite::expr::convert.
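
The two passes described above can be sketched on a toy model. This does not use rustc's HIR or the actual c2rust data structures: `Func`, `Rewrite`, and `plan_shims` are hypothetical names invented for this illustration, function names stand in for `DefId`s, and cast generation is left out entirely.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a function in the analyzed crate.
struct Func {
    name: &'static str,
    /// True if analysis succeeded and the signature was rewritten.
    rewritten: bool,
    /// Names of the functions called from this function's body.
    callees: Vec<&'static str>,
}

/// Hypothetical rewrite descriptions; the real tool edits source spans.
#[derive(Debug, PartialEq)]
enum Rewrite {
    /// In `caller`, rename the call target `callee` to `<callee>_shim`.
    RenameCall { caller: String, callee: String },
    /// Insert the definition of `<callee>_shim` after the definition of `callee`.
    InsertShim { callee: String },
}

fn plan_shims(funcs: &[Func]) -> Vec<Rewrite> {
    let rewritten: BTreeSet<&str> = funcs
        .iter()
        .filter(|f| f.rewritten)
        .map(|f| f.name)
        .collect();

    let mut rewrites = Vec::new();
    let mut need_shim = BTreeSet::new();

    // Pass 1: in each failed function, redirect calls to rewritten callees
    // and remember which callees were seen.
    for f in funcs.iter().filter(|f| !f.rewritten) {
        for &callee in &f.callees {
            if rewritten.contains(callee) {
                rewrites.push(Rewrite::RenameCall {
                    caller: f.name.to_string(),
                    callee: callee.to_string(),
                });
                need_shim.insert(callee);
            }
        }
    }

    // Pass 2: emit one shim definition per recorded callee.
    for callee in need_shim {
        rewrites.push(Rewrite::InsertShim { callee: callee.to_string() });
    }
    rewrites
}

fn main() {
    let funcs = [
        Func { name: "bad", rewritten: false, callees: vec!["good"] },
        Func { name: "good", rewritten: true, callees: vec![] },
    ];
    for rewrite in plan_shims(&funcs) {
        println!("{rewrite:?}");
    }
}
```

Collecting the callees in a set means a callee that is called from several failed functions still gets only one shim definition.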

This is a counterpart to #936: that one handles calls from rewritten code into non-rewritten code, and this one handles the reverse.

This PR also marks failed fns as FIXED and thus doesn't propagate analysis failures to callers.

spernsteiner commented 1 year ago

This code uses the same cast-generation logic as mir_op, so whatever's supported there will also be supported here. In particular, the *mut T -> &[T] cast is not supported, so shim generation will fail for it (as in #950).
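
For context on the `*mut T -> &[T]` case, here is a hand-written sketch of what such a shim needs. The thread does not say why the cast is unsupported. One obstacle, stated here as an assumption, is that a raw pointer carries no length, so a slice can only be built when the length comes from somewhere else. `sum` and `sum_shim` are hypothetical names, and the explicit `len` parameter is an addition that an automatically generated shim with the original signature would not have.

```rust
use std::slice;

// Hypothetical rewritten callee that now takes a slice.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Hypothetical hand-written shim. It is only possible here because the
// caller also supplies `len`; the pointer alone does not say how many
// elements it refers to.
// Safety: `ptr` must be non-null, aligned, and valid for reads of `len`
// consecutive `i32` values for the duration of the call.
unsafe fn sum_shim(ptr: *mut i32, len: usize) -> i32 {
    let xs: &[i32] = unsafe { slice::from_raw_parts(ptr as *const i32, len) };
    sum(xs)
}

fn main() {
    let mut data = [1, 2, 3];
    let total = unsafe { sum_shim(data.as_mut_ptr(), data.len()) };
    println!("{total}"); // prints 6
}
```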

kkysen commented 1 year ago

Can I take a little bit more of a look at this after it's rebased?