feroldi / oxide

Regionalized Value State Dependence Graph written in Rust
MIT License
9 stars 0 forks source link

LykenSol/vsdg has started work on a VSDG implementation in Rust #1

Open WildCryptoFox opened 5 years ago

WildCryptoFox commented 5 years ago

https://github.com/LykenSol/vsdg/

Just a quick FYI in case you've not seen it. I'm looking forward to some VSDG based compilers in Rust. May experiment with this myself. +star'd

eddyb commented 5 years ago

Sadly that repo is abandoned, we'll probably scrap it and try again (if ever). Using a mutable graph turned out to be a constant pain, and we're still unclear on what the well-formed-ness conditions should be, given the papers we've looked at. (e.g. what disallows one arm of a γ node to refer to nodes from the other arm?)

RVSDG seems easier to reason about and I've been playing a bit with interning graphs lately, although I haven't gone for RVSDG exactly just yet.

feroldi commented 5 years ago

Using a mutable graph turned out to be a constant pain

@eddyb Could you elaborate on that? I'm going for the mutable graph approach, but I'm not sure which issues it brings in.

eddyb commented 5 years ago

It's been a while, but there are at least two issues:

That said, it's possible only some, or even none, of this applies to you, so feel free to ignore all this.

My experiments more recent than our VSDG attempt have put interning in a very favorable light, but I haven't yet encoded actual control-flow in an interned IR, so there's also that.

feroldi commented 5 years ago

What would interning mean in that context? I've searched it, but I haven't found anything.

Regarding mutability, a graph rewriting algorithm that takes an in arena graph and writes the new, transformed graph into another arena could resolve the issues with well-formedness. I'm not sure about the applicability of this approach, though.

On Tue, Aug 20, 2019, 06:27 Eduard-Mihai Burtescu notifications@github.com wrote:

It's been a while, but there are at least two issues:

  • being able to have input ports disconnected
    • you can get rid of this by always creating nodes from their node type/operation/etc. and their inputs
  • even if you have correct-by-construction graphs, being able to mutate them in-place (as opposed to creating new modified graphs) makes that much harder to reason about, IMO
    • among other things, you likely end up needing to track everything that uses an output port, which leads to what one might call "context-sensitive reasoning"

That said, it's possible only some, or even none, of this applies to you, so feel free to ignore all this.

My experiments more recent than our VSDG attempt have put interning in a very favorable light, but I haven't yet encoded actual control-flow in an interned IR, so there's also that.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/feroldi/oxide/issues/1?email_source=notifications&email_token=ACB4BHSGKIQY37XQ75SLN3LQFO2HXA5CNFSM4IJLQMT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4VVHAQ#issuecomment-522933122, or mute the thread https://github.com/notifications/unsubscribe-auth/ACB4BHX3CRZOZK6Y5MFL4CTQFO2HXANCNFSM4IJLQMTQ .

eddyb commented 5 years ago

Interning is where you only allocate a value if it hasn't seen before, otherwise you reuse the existing one. So it's usually an allocator (an arena or even just a Vec - where indices take the place of pointers) plus a cache (i.e. some sort of map/set that lets you look up the interned pointer/index given an uninterned value, and therefore avoid allocating it again). This deduplication also means that you can do a trivial comparison (of the interned pointer/index) to check if two (potentially very deep) values are equal.

This is mostly used for strings, but a more advanced example is how the Rust compiler represents types.

Because of interning, types like e.g. Vec<Option<(u32, bool)>> and [Option<(u32, bool)>] will always share the u32, bool, (u32, bool) and Option<(u32, bool)> "nodes", even if created separately, from scratch (and we rarely think of them as "tree nodes").

In a more formal sense, these are pure terms, sort of like this:

let ty0 = Primitive("u32");
let ty1 = Primitive("bool");
let ty2 = Tuple([ty0, ty1]);
let ty3 = Adt("Option", [ty2]);
let ty4 = Adt("Vec", [ty3]);
let ty5 = Slice(ty3);

If this looks a bit familiar, it's because this is very much like SSA! But like a pure functional lambda calculus, there are no side-effects.

The main reason you don't tend to have interning used for SSA, is the CFG that tends to exist around it - instead, you have things like GVN, which have to be careful to not break anything involving side-effects.

(R)VSDG, OTOH, has a much better chance at relying on interning: the side-effects are encoded entirely in how "terms" refer other "terms" (i.e. nodes and their ports), so:

feroldi commented 5 years ago

Quite insightful. IIRC, Clang does something similar with its type system as well.

How would commutative operations work for interning? Would it generate the same hash for two commutative nodes whose operands are the same but in reversed order?

eddyb commented 5 years ago

That's an interesting question. I hadn't considered commutative hashing until now. From my discussions with @eternaleye, one of the best known ways to approach the fact that many IR graphs are equivalent is eqsat - but I'm not aware of anyone combining that with (R)VSDG.

eternaleye commented 5 years ago

Commutative hashing (and associative hashing) are problematic for any use case where you might get adversarial input, because adding mathematical properties to your hash function essentially gives adversaries the ability to solve for colliding inputs efficiently. Some properties make it easier than others, but e.g. commutative + associative together trivially gives the result from this SE answer:

If $H(a,a) = b \ne a$, then we have $H(a,b) = H(a,H(a,a)) = H(H(a,a),a) = H(b,a)$ (because of associativity), and hence we found a commuting pair.

"Adversarially-triggered miscompilation" is a phrase I really don't want to have to think about - if someone figures out how to generate targeted collisions to your password comparison function, so they can attempt to replace it with a backdoored version, things go very wrong very quickly.

Furthermore, in my opinion interning should only ever be treated as a strictly internal space-saving representational change, and not an externally-visible identity of nodes. The reason is that what transformations should be applied to a node (during optimization) depends on its context and its structure. If interning is exposed as an identifier, and each identifier is optimized only once, then that optimization is necessarily ignorant of context - and thus, suboptimal.

Eqsat avoids this by explicitly representing all equivalent IR forms, and doing so in a contextually-sound manner.

There's actually a very easy way to fit eqsat into VSDG, and thus (by extension) RVSDG. For VSDG:

  1. Extend the definition of a node s.t. every inflow port may accept any cardinality of incoming edges
  2. All incoming edges to a single port are constrained to produce the same value in all circumstances
  3. Eqsat optimization can now proceed by selecting a subgraph, optimizing it, and splicing it back into the VSDG

For RVSDG, one can note that the above change to the graph definition is (essentially) the addition of variadic gamma nodes, whose predicate is irrelevant. By doing exactly this, one arrives at a representation that is, itself, a valid RVSDG - the only difference is that there is now a class of "optimization predicates" that, given a complete assignment of, selects a single optimized RVSDG equivalent to the original program.

Eqsat then proceeds by transforming an RVSDG selected by fixing such a set of parameters, then splicing it back into the "whole" RVSDG, inserting optimization gammas as appropriate. This proceeds until it reaches a fixed point (or until the optimization budget is exhausted), after which each assignment of optimization-gammas is evaluated for optimality. The optimal RVSDG is chosen from that set, and then compilation continues.