cucapra / filament

Fearless hardware design
http://filamenthdl.com/
MIT License

New IR next steps #153

Closed · rachitnigam closed this issue 1 year ago

rachitnigam commented 1 year ago

The new IR (#149) will let us simplify much of the compiler while making it faster overall. I'm going to lay out the next steps needed to rewrite the compiler so that it uses only the new IR after the initial frontend AST construction.

IR Concepts

First, it's useful to understand the invariants of the IR and the benefits it provides.

The salient benefit of all this is that adding new features becomes much easier. Here are a couple that I can think of:

  1. Local ports (wires): These are essentially the same as bundles of length 1.
  2. Local parameters: Easy to do because the IR does not care about where parameters are defined (see the sketch after this list).
  3. Existential parameters: Again, much easier to do for the same reason as (2).
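To make (2) and (3) concrete, here is a minimal sketch of what the index-based storage could look like; all names here are hypothetical, not the compiler's actual types. Each component owns a flat arena of interned values, and everything refers to everything else by index (where the %e0/%pr0/%t0 names in the IR dump later in this thread come from), so a parameter's definition site is simply not part of the representation:

#[derive(Clone, Copy, PartialEq)]
struct ExprIdx(u32);

#[derive(Clone, Copy, PartialEq)]
struct ParamIdx(u32);

#[derive(PartialEq)]
enum Expr {
    Const(u64),
    Param(ParamIdx),
    Add(ExprIdx, ExprIdx),
}

#[derive(Default)]
struct Component {
    // Interned: structurally equal expressions share a single index, so
    // exprs mentioning signature, local, or existential parameters all
    // live in the same flat table.
    exprs: Vec<Expr>,
}

impl Component {
    fn intern(&mut self, e: Expr) -> ExprIdx {
        // A linear scan keeps the sketch short; a real interner would
        // use a hash map from expression to index.
        if let Some(i) = self.exprs.iter().position(|x| *x == e) {
            return ExprIdx(i as u32);
        }
        self.exprs.push(e);
        ExprIdx(self.exprs.len() as u32 - 1)
    }
}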

These design decisions have some downsides too. The biggest one is that, because the frontend syntax is canonicalized and all names are removed, it is impossible to reconstruct source location information. As this comment mentions, the solution is to track source location information in another data structure altogether. This is a must if we want to report errors of the same quality as the current compiler (or better).
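As a minimal sketch of that side-table idea, assuming a simple Span type (again, hypothetical names rather than the compiler's actual design): source positions are keyed by IR index and live entirely outside the IR, so canonicalization never has to preserve them:

use std::collections::HashMap;

#[derive(Clone, Copy)]
struct Span {
    file: u32,  // index into a table of source files
    start: u32, // byte offsets into the file
    end: u32,
}

#[derive(Default)]
struct SourceInfo {
    // Maps raw IR indices (e.g. expr and port indices) to source spans.
    exprs: HashMap<u32, Span>,
    ports: HashMap<u32, Span>,
}

impl SourceInfo {
    // The frontend records a span when it lowers a construct...
    fn record_expr(&mut self, idx: u32, span: Span) {
        self.exprs.insert(idx, span);
    }

    // ...and error reporting looks the span up when a check fails.
    fn expr_span(&self, idx: u32) -> Option<Span> {
        self.exprs.get(&idx).copied()
    }
}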

Type Checking & Interval Checking

As mentioned in #111, we should separate type checking from interval checking. I suggest we adopt the following strategy: each check is implemented as a compiler pass responsible for adding assert statements to the program (along with information about why each assert was created). A later pass is responsible for actually discharging the asserts using a solver.

This decoupling of assertion generation and discharging (sketched after the list below) has a few benefits:

  1. We can see which assertions were created after each pass (and selectively discharge them to debug problems).
  2. We can use passes like assertion hoisting (#152) to move all asserts to the top and discharge them with fewer queries. Note: the F* compiler first attempts to discharge all asserts in one go and only splits them up to report errors if problems occur.
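A rough sketch of the shape this could take, with made-up names (Obligations, Solver) rather than actual interfaces: checking passes only record obligations, and the discharge pass tries one big query first, falling back to per-assert queries only to localize failures:

#[derive(Clone, Copy)]
struct PropIdx(u32); // index into the component's interned propositions

struct Assert {
    prop: PropIdx,
    reason: String, // why the obligation exists, for error reporting
}

#[derive(Default)]
struct Obligations {
    asserts: Vec<Assert>,
}

impl Obligations {
    // Called by the type-checking and interval-checking passes.
    fn add(&mut self, prop: PropIdx, reason: impl Into<String>) {
        self.asserts.push(Assert { prop, reason: reason.into() });
    }
}

trait Solver {
    fn prove(&mut self, p: PropIdx) -> bool;
    fn prove_all(&mut self, ps: &[PropIdx]) -> bool;
}

// One query for everything (as F* does), then per-assert queries only
// if that fails, to pinpoint the culprit for error reporting.
fn discharge(obs: &Obligations, solver: &mut impl Solver) -> Result<(), String> {
    let props: Vec<PropIdx> = obs.asserts.iter().map(|a| a.prop).collect();
    if solver.prove_all(&props) {
        return Ok(());
    }
    for a in &obs.asserts {
        if !solver.prove(a.prop) {
            return Err(format!("cannot prove assertion: {}", a.reason));
        }
    }
    Ok(())
}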

Monomorphization

This is the big one that's not yet clear to me. Monomorphization requires "evaluating" the commands of a component and generating a new component. To do this correctly, we probably need to add a bunch more methods to the IR structures, but properties such as "unique names" in the IR should simplify things dramatically.
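As a speculative sketch of the overall shape (all types hypothetical and heavily simplified; e.g. loop bounds here are pre-evaluated constants rather than expression indices), monomorphization "runs" the control program under a parameter binding, unrolling each loop into straight-line commands:

use std::collections::HashMap;

#[derive(Clone)]
enum Cmd {
    // `for <param> in <start>..<end> { body }`
    For { param: u32, start: u64, end: u64, body: Vec<Cmd> },
    // stand-in for any straight-line command (connect, invoke, ...)
    Straight(String),
}

// Evaluate one command under `bind`, appending results to `out`.
fn eval(cmd: &Cmd, bind: &mut HashMap<u32, u64>, out: &mut Vec<Cmd>) {
    match cmd {
        Cmd::For { param, start, end, body } => {
            for i in *start..*end {
                bind.insert(*param, i); // bind the loop index...
                for c in body {
                    eval(c, bind, out); // ...and evaluate the body
                }
            }
            bind.remove(param);
        }
        // Straight-line commands are copied into the new component
        // (a real pass would also resolve the exprs they mention).
        s => out.push(s.clone()),
    }
}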

Optimizations

The big thing that the IR will eventually let us do is implement more optimizations. It's not clear to me what these will look like, since there is no semantic information attached to primitives, but once we go down that road, I think building new, powerful, time-sensitive optimizations will become possible.

rachitnigam commented 1 year ago

Also, for reference purposes, the high-level Filament program:

comp Shift[#W, #N]<G: 1>(
    @[G, G+1] in: #W
) -> (
    @[G+#N, G+#N+1] out: #W
) {
    bundle f[#N+1]: for<#k> @[G+#k, G+#k+1] #W;

    f{0} = in;
    for #i in 0..#N {
        d := new Delay[#W]<G+#i>(f{#i});
        f{#i+1} = d.out;
    }
    out = f{#N};
}

Looks like this in the IR:

external comp %comp0;
external comp %comp1;
external comp %comp2;
external comp %comp3;
comp %comp4[%pr0, %pr1]<%ev0: %e1>(
  %p0: for<%pr2: %e1> @[%t0, %t1] %e2
) -> (
  %p1: for<%pr3: %e1> @[%t2, %t3] %e2
) {
  %pr2 = param bundle;
  %pr3 = param bundle;
  %pr4 = param local;
  %pr5 = param bundle;
  %pr6 = param bundle;
  %pr7 = param local;
  %pr8 = param bundle;
  %e0 = expr 0;
  %e1 = expr 1;
  %e2 = expr %pr0;
  %e3 = expr %pr1;
  %e4 = expr %e1 + %e3;
  %e5 = expr %pr4;
  %e6 = expr %e1 + %e5;
  %e7 = expr 2;
  %e8 = expr %e5 + %e7;
  %e9 = expr %pr6;
  %e10 = expr %e1 + %e9;
  %e11 = expr %pr7;
  %e12 = expr %e1 + %e11;
  %e13 = expr %e1 + %e12;
  %t0 = time %ev0+%e0;
  %t1 = time %ev0+%e1;
  %t2 = time %ev0+%e3;
  %t3 = time %ev0+%e4;
  %t4 = time %ev0+%e5;
  %t5 = time %ev0+%e6;
  %t6 = time %ev0+%e8;
  %t7 = time %ev0+%e9;
  %t8 = time %ev0+%e10;
  %t9 = time %ev0+%e11;
  %t10 = time %ev0+%e12;
  %prop0 = prop false;
  %prop1 = prop true;
  %prop2 = prop %e11 >= %e0;
  %prop3 = prop %e3 > %e11;
  %prop4 = prop (%prop2 & %prop3);
  %p2: bundle(out) for<%pr5: %e1> @[%t5, %t6] %e2;
  %p3 = bundle for<%pr6: %e4> @[%t7, %t8] %e2;
  %p4: bundle(in) for<%pr8: %e1> @[%t9, %t10] %e2;
  %inst0 = instance %comp1[%e2];
  %inv0, %p2 = invoke %inst0<%t4>;
control:
  %p3[%e0..%e1) = %p0[%e0..%e1)
  for %pr7 in %e0..%e3 {
    %inst0;
    %p4[%e0..%e1) = %p3[%e11..%e12)
    %inv0;
    %p3[%e12..%e13) = %p2[%e0..%e1)
  }
  %p1[%e0..%e1) = %p3[%e3..%e4)
}

rachitnigam commented 1 year ago

Okay, I came up with what I think is a pretty clever solution to monomorphization. During monomorphization, we need to substitute concrete expressions for all parameters P1, P2, ..., Pn and evaluate statements using the results. Instead of doing the standard thing of keeping a binding and substituting the value of each parameter, here is what we do:

  1. Walk over all interned exprs bottom-up (from the smallest index to larger ones) and build a map from each old expression to the one generated by substitution and evaluation.
  2. Then, instead of having to evaluate anything at all, just walk the control program and replace all old expr indices with values from the map.

The nice thing about this is that we are guaranteed to evaluate each expression exactly once, and we can keep the computed values in a dense map, which is really good for performance as well.
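A minimal sketch of that walk, reusing the shape of the hypothetical Expr arena sketched earlier (again, not the compiler's actual types). Because an interned expression can only refer to expressions with smaller indices, a single front-to-back pass evaluates everything exactly once, and the "dense map" is just a Vec indexed by the old ExprIdx:

#[derive(Clone, Copy)]
struct ExprIdx(u32);

#[derive(Clone, Copy)]
struct ParamIdx(u32);

enum Expr {
    Const(u64),
    Param(ParamIdx),
    Add(ExprIdx, ExprIdx),
}

// Evaluate every interned expression under a concrete parameter binding.
fn eval_all(exprs: &[Expr], binding: &[u64]) -> Vec<u64> {
    let mut values = Vec::with_capacity(exprs.len());
    for e in exprs {
        let v = match e {
            Expr::Const(c) => *c,
            Expr::Param(p) => binding[p.0 as usize],
            // Operands always have smaller indices, so they are already
            // evaluated; just index into the dense table.
            Expr::Add(l, r) => values[l.0 as usize] + values[r.0 as usize],
        };
        values.push(v);
    }
    values
}

Rewriting the control program is then pure replacement: every old expr index i becomes the constant values[i], and no evaluation happens during that second walk.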

rachitnigam commented 1 year ago

This is considered closed with #216.