exo-lang / exo

Exocompilation for productive programming of hardware accelerators
https://exo-lang.dev
MIT License
293 stars 28 forks source link

New Analysis #486

Open skeqiqevian opened 1 year ago

skeqiqevian commented 1 year ago

Someone else should elaborate on this... for now I'll just tag a bunch of relevant issues that might affect decision making: #102, #127, #174, #287, #294, #314, #377, #390

yamaguchi1024 commented 12 months ago

Below is the discussion we (I mean, 100% written by @gilbo) had, which previously existed as a comment in dataflow.py. Putting it here for a better reference in the future.

Big Question: How do we represent the result of dataflow analysis?

We are taking Option1.

New compilation flow is as follows:

pyAST  -->  UAST
        --[type checking, inference, bounds checking]-->  [LoopIR]
        --[backend checks]-->  C-code as strings

[LoopIR]  --[Primitive Transformation]-->  [LoopIR]
 As part of Primitive Transformation:
      [LoopIR]  --[Dataflow analysis]-->  "AST annotated w/dataflow results"
                       --[modified new_eff_analysis]-->  AExpr/Query (Discharged to SMT)

Below are Gilbert's notes to explain Abstract Interpretation to me.

What is Lattice?

A Lattice is a set X equipped with

Satisfying the following axioms:

Consequences

Examples:

Def. Lattice Homomorphism:

The Abstract Interpretation Algorithm

Propagate abstract values through statements in any order. This will compute a fixed-point assignment of variables to abstract values at every program point. This algorithm will terminate if the abstract lattice has a finite height.

P(V_1 x V_2 x ...) I have a concrete domain of values V. Define the "Concrete" lattice for V to be Powerset(V) (note every powerset is a lattice, with subset as partial order, union as join, and intersection as meet)

An abstraction of V (eqv. Powerset(V)) is a lattice A s.t. we have two lattice homomorphisms: abstraction (abs : P(V) -> A) concretization (conc : A -> P(V)) satisfying a property: namely that they are "adjoint" in the following sense abs(conc(x)) = x conc(abs(x)) <= x

A product lattice of lattices A and B is given as

Can we form a product abstraction? Well, we have some V which A abstracts and some W which B abstracts? No, we're actually interested in the case where A and B both abstract V. So, what we are actually interested in isn't a totally arbitrary product...

Product Abstraction (on a common concrete domain V) Given abstractions (A,absA,concA) and (B,absB,concB) both of V, Construct a product abstraction:

conclusion 1: any (bot,_) or (_,bot) must be bot in A x B! i.e. we cannot simply use a product lattice conjecture: this occurs for any (singleton) set, not just bottom

Major Conclusion:

Btw, suppose I have P(X) the power set of X. Using "type theory" notation, I can also talk about the set of all functions X -> Bool, or all predicates on X (e.g. in Coq one would write X -> Prop) All of these are (ignoring stupid logical foundation issues) essentially the same. Why? Well, consider some S in P(X) S is a subset of X; define f_S(x) = "x in S". Similarly, using set comprehension and given f : X -> Bool, define S_f = { x | f(x) }

(one more thing for fun) X -> P(Y) ~== X -> Y -> Bool ~== X x Y -> Bool ~== P(X x Y) ~== Rel X Y

A program is a control-flow-graph, and btw, let's fission each basic block into invidividual SSA statements (including unique variable names, and phi nodes)

A program point is basically an edge between statements in this CFG In some sense, program points are the real "states" and the statements are transitions.

More precisely the state of my machine is (PC (i.e. program point), stack (i.e. variable environment mapping))

Abstract States (PC, stack but values from A instead of from V)

How do I abstract a statement (i.e. function) y = f(x) Well, first x is now a abstract value, so concretize it to a set of values Then, we know how to map each individual value with f This produces a set of values for y Then re-abstract this set In other words... _y = _f(_x) = abs({ f(x) | x in conc(_x) }) That's a definition; we need to work it out for any given choice of language (i.e. statements/functions f_i) and choice of abstract lattice (i.e. A, abs, conc)

How do we abstract multiple incoming edges to some program point? (i.e. how do we abstract phi nodes) Answer: phi is join. done.

Abs = V U {top, bot} -- only decision we've made Conc = P(V) abs : Conc -> Abs conc : Abs -> Conc s.t. abs(conc(x)) = x conc(abs(X)) >= X

What is the specific definition of conc? conc(v) = {v} for v in V conc(bot) = {} conc(top) = V

yamaguchi1024 commented 12 months ago

Also, we have a limited version of dataflow analysis in LoopIR_dataflow.py, which usage probably should be replaced by DataflowIR once it's done.

yamaguchi1024 commented 12 months ago

Discussion topic

gilbo commented 11 months ago

@yamaguchi1024 These are more of your meeting notes and not really a description of a project plan for how to move forward on "new analysis" at a high level.