stephenrkell / libcrunch

A dynamically safe implementation of C, using your existing C compiler. Tolerates idiomatic C code pretty well. Not perfect... yet.

Instrumentations should be more factored #7

Open stephenrkell opened 2 years ago

stephenrkell commented 2 years ago

A shortish-term goal is to allow many different instrumentations to be created easily. We already have this to some extent, with the bounds-checking and type-checking parts. It would be good to be able to reproduce many papers' approaches/results.

I am envisaging the following parts.

This relates to #4, in that we have to revisit our approach to packaging dependencies more broadly.

The pitch for all this is a research testbench that is more accessible (simpler), more stable (less churn) and more comprehensive (source-level) than LLVM.

stephenrkell commented 2 years ago

One problem with a CIL inlinifier is that it can't do the site-specific codegen we envisage for stuff like inline caching. E.g. if we declared a static local for cache purposes, it would get scoped to the inline function whereas we want it in the caller (one per call site). This would be easy to do with a macro. I guess doing a CIL pass over the call sites is not too much bother.
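To make the scoping problem concrete, here is a rough sketch rather than anything taken from libcrunch. All the names (`slow_is_a`, `is_a_shared_cache`, `IS_A_SITE_CACHE`, `run_shared`, `run_per_site`) are made up for illustration, the "check" always passes, and the macro relies on GNU statement expressions (GCC and Clang). The inline function has one cache slot shared by every caller, whereas each macro expansion gets its own static.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slow-path check; it only counts how often it is reached. */
static unsigned long slow_check_calls;

static int slow_is_a(const void *obj, const char *type_name)
{
    assert(type_name != NULL);
    (void) obj;
    ++slow_check_calls;
    return 1; /* assumption: the check always passes in this sketch */
}

/* Inline function: the static local is ONE slot shared by all call sites. */
static inline int is_a_shared_cache(const void *obj, const char *type_name)
{
    static const void *cached_obj;
    if (obj != NULL && obj == cached_obj) return 1; /* cache hit */
    int ok = slow_is_a(obj, type_name);
    if (ok) cached_obj = obj;
    return ok;
}

/* Macro: every expansion declares its own static, i.e. one slot per site. */
#define IS_A_SITE_CACHE(obj, type_name) __extension__ ({          \
    static const void *site_cached_obj;                           \
    const void *o_ = (obj);                                       \
    int ok_ = 1;                                                  \
    if (o_ == NULL || o_ != site_cached_obj) {                    \
        ok_ = slow_is_a(o_, (type_name));                         \
        if (ok_) site_cached_obj = o_;                            \
    }                                                             \
    ok_; })

static int obj_a, obj_b;

/* Two call sites alternating between two objects thrash the shared slot. */
static unsigned long run_shared(int iters)
{
    slow_check_calls = 0;
    for (int i = 0; i < iters; ++i) {
        (void) is_a_shared_cache(&obj_a, "int"); /* site 1 */
        (void) is_a_shared_cache(&obj_b, "int"); /* site 2 */
    }
    return slow_check_calls;
}

/* The same pattern with per-site slots misses only on the first iteration. */
static unsigned long run_per_site(int iters)
{
    slow_check_calls = 0;
    for (int i = 0; i < iters; ++i) {
        (void) IS_A_SITE_CACHE(&obj_a, "int"); /* site 1 */
        (void) IS_A_SITE_CACHE(&obj_b, "int"); /* site 2 */
    }
    return slow_check_calls;
}
```

With 100 iterations the shared-slot version reaches the slow path on every call, while the per-site version reaches it once per site. A CIL pass over the call sites would presumably have to emit something equivalent to what the macro expands to.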

stephenrkell commented 2 years ago

Another issue is our use of hot/cold path-splitting. It seems hard to make this modular, although it could be done.

One intriguing application of the hot/cold path is for speculative hoisting. Does this even make sense? E.g. given a check inside a loop, ideally we might prove the necessary conditions for hoisting the check out of the loop (into one "big range" check). These are something like:

But what if we can't prove those things? Can we do a speculative check, knowing that if it passes we are all good (the fast path omits the check inside the loop body), but if it fails we can fall back on a secondary path that does include the check inside the loop?

I think this kind of speculation could be good for cases where the loop exits early. It could also be good for cases where the allocation grows inside the loop, i.e. if there's a need to grow, the primary check would fail but the secondary path would be fine.
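A minimal sketch of the early-exit case, under stated assumptions: `range_ok`, `sum_checked` and `sum_until_negative` are hypothetical names, bounds are modelled as plain indices rather than libcrunch's real metadata, and `__builtin_expect` and the `cold`/`noinline` attributes are GCC/Clang extensions. The speculative "big range" check guards a check-free loop, and a failed speculation is not an error; it just routes to the cold path that checks each access.

```c
#include <assert.h>
#include <stddef.h>

static unsigned long checks_done; /* counts dynamic bounds checks */

/* Hypothetical check: does [start, start+count) lie within [0, len)? */
static int range_ok(size_t len, size_t start, size_t count)
{
    ++checks_done;
    return start <= len && count <= len - start;
}

/* Cold path: same loop, with the check kept inside the body. */
__attribute__((noinline, cold))
static long sum_checked(const int *buf, size_t len, size_t n, int *violated)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!range_ok(len, i, 1)) { *violated = 1; break; }
        if (buf[i] < 0) break; /* early exit */
        sum += buf[i];
    }
    return sum;
}

/* Sums buf[0..n), stopping at the first negative element. */
static long sum_until_negative(const int *buf, size_t len, size_t n,
                               int *violated)
{
    assert(violated != NULL);
    *violated = 0;
    if (__builtin_expect(range_ok(len, 0, n), 1)) {
        /* Hot path: the hoisted check covers every access in the loop. */
        long sum = 0;
        for (size_t i = 0; i < n; ++i) {
            if (buf[i] < 0) break;
            sum += buf[i];
        }
        return sum;
    }
    /* Speculation failed: not necessarily a bug, so re-run with checks. */
    return sum_checked(buf, len, n, violated);
}
```

Calling this with a 4-element buffer `{5, 6, -1, 7}` and `n = 10` fails the hoisted check, yet the cold path finishes cleanly because the loop exits at index 2 before any out-of-bounds access. Here the split and the hoist are written together by hand, which is roughly what a combined pass would generate.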

I can envisage a combined pass that does the hot/cold split and loop hoisting together. Much harder to see a way to separate / parameterise those. I think it would be fine to do the combined pass, though.