immunant / c2rust

Migrate C code to Rust
https://c2rust.com/

analyze: add pointee_type analysis #1029

Closed spernsteiner closed 8 months ago

spernsteiner commented 9 months ago

This branch adds an analysis that computes a set of possible pointee types for each pointer. It includes only the analysis; the analysis results aren't used to drive any rewriting yet. On algo_md5, the analysis correctly identifies that the _input argument to MD5_Update should actually point to u8s, that the memset in li_MD5Transform operates on u32s, and that the memset in MD5_Final operates on an MD5_CTX.
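As a rough illustration of what "a set of possible pointee types for each pointer" can look like, here is a minimal sketch, not the code on this branch. It assumes two hypothetical constraint forms (`ObservedAs` and `FlowsTo`, both names invented here) and solves them by naive fixed-point iteration. The direction in which types propagate across an assignment is also an assumption made for the example.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical constraint forms, invented for this sketch.
#[derive(Clone, Copy, Debug)]
enum Constraint {
    /// Pointer `ptr` is observed being used as a pointer to `ty`
    /// (for example, it is dereferenced at that type).
    ObservedAs { ptr: &'static str, ty: &'static str },
    /// Every possible pointee type of `src` is also a possible
    /// pointee type of `dst`.
    FlowsTo { src: &'static str, dst: &'static str },
}

type PointeeSets = HashMap<&'static str, BTreeSet<&'static str>>;

/// Repeatedly apply all constraints until no set grows any further.
fn solve(constraints: &[Constraint]) -> PointeeSets {
    let mut sets = PointeeSets::new();
    let mut changed = true;
    while changed {
        changed = false;
        for c in constraints {
            match *c {
                Constraint::ObservedAs { ptr, ty } => {
                    // `insert` returns true only if `ty` was not present yet.
                    changed |= sets.entry(ptr).or_default().insert(ty);
                }
                Constraint::FlowsTo { src, dst } => {
                    let from = sets.get(src).cloned().unwrap_or_default();
                    let to = sets.entry(dst).or_default();
                    for ty in from {
                        changed |= to.insert(ty);
                    }
                }
            }
        }
    }
    sets
}

/// Constraints loosely modeled on the `MD5_Update` case described above:
/// `input` is read as bytes, and `input` is a cast of `_input`.
fn md5_update_example() -> PointeeSets {
    solve(&[
        Constraint::FlowsTo { src: "input", dst: "_input" },
        Constraint::ObservedAs { ptr: "input", ty: "u8" },
    ])
}
```

In this toy model, `md5_update_example()` ends with `u8` in the set for `_input`, even though `_input` itself is only ever declared as a `void` pointer.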

The next step after this will be to use the analysis results in rewriting. We need to rewrite pointer types to use the inferred pointees (such as changing _input: &c_void to _input: &[u8]), replace memcpy and memset with safe operations, and (optionally) delete casts that are now no-ops (such as input = _input as *const c_uchar, once input and _input are both rewritten to &[u8]). This may also require some changes to our current void* cast handling.
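To make the intended rewrites concrete, here is a hedged sketch of what the output might look like once pointee types are known. Nothing here is produced by the tool today. `Md5Ctx` is a simplified stand-in whose field layout is assumed, and the function bodies are toy versions rather than the real MD5 logic.

```rust
/// Illustrative stand-in for `MD5_CTX`; the field layout is an assumption.
#[derive(Debug, PartialEq)]
struct Md5Ctx {
    state: [u32; 4],
    count: [u32; 2],
    buffer: [u8; 64],
}

impl Md5Ctx {
    fn zeroed() -> Self {
        Md5Ctx { state: [0; 4], count: [0; 2], buffer: [0; 64] }
    }
}

/// With `_input: *const c_void` rewritten to `input: &[u8]`, the cast
/// `input = _input as *const c_uchar` becomes a no-op and can be dropped.
/// Toy body: stores at most one buffer's worth of input.
fn update(ctx: &mut Md5Ctx, input: &[u8]) {
    let n = input.len().min(ctx.buffer.len());
    // A bounds-checked slice copy takes the place of `memcpy`.
    ctx.buffer[..n].copy_from_slice(&input[..n]);
    // Track the number of bits seen, as MD5 implementations typically do.
    ctx.count[0] = ctx.count[0].wrapping_add((n as u32) << 3);
}

/// A `memset` known to operate on `u32`s can become a slice fill.
fn clear_block(block: &mut [u32]) {
    block.fill(0);
}

/// A `memset` known to operate on a whole context can become an assignment.
fn clear_ctx(ctx: &mut Md5Ctx) {
    *ctx = Md5Ctx::zeroed();
}
```

The point of the sketch is that each safe replacement depends on knowing the pointee type: `fill` needs an element type, and the struct assignment needs to know the `memset` covers exactly one `Md5Ctx`.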

aneksteind commented 9 months ago

Is this doing equality saturation? It looks very similar to the approach egg takes with e-graphs, so I'm curious why something like that wasn't used and why this was done by hand.

aneksteind commented 9 months ago

It's a little difficult for me to review this for correctness, especially given there are no tests. How can I be most useful to you here as a reviewer? One of my concerns is that it's all hand-made with no reference to a standard equivalence constraint solving algorithm. We do this in a few places and it seems repetitious and error-prone; please tell me if my concerns are overblown.
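For reference, one standard building block for equivalence constraints is union-find (disjoint sets). The sketch below is a generic textbook version with path compression and union by rank. It is not taken from this branch and does not claim to cover everything the analysis needs, since a pointee-set analysis may also involve non-symmetric constraints that union-find alone cannot express.

```rust
use std::cmp::Ordering;

/// Textbook disjoint-set structure over indices `0..n`.
struct UnionFind {
    parent: Vec<usize>,
    rank: Vec<u8>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect(), rank: vec![0; n] }
    }

    /// Return the representative of `x`'s class, compressing the path.
    fn find(&mut self, x: usize) -> usize {
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Second pass: point every node on the path directly at the root.
        let mut cur = x;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Merge the classes of `a` and `b`. Returns false if already merged.
    fn union(&mut self, a: usize, b: usize) -> bool {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return false;
        }
        match self.rank[ra].cmp(&self.rank[rb]) {
            Ordering::Less => self.parent[ra] = rb,
            Ordering::Greater => self.parent[rb] = ra,
            Ordering::Equal => {
                self.parent[rb] = ra;
                self.rank[ra] += 1;
            }
        }
        true
    }
}
```

A shared helper along these lines could be unit-tested once and reused wherever equality constraints come up, which addresses the repetition concern independently of whether it fits this particular analysis.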