calyxir / calyx

Intermediate Language (IL) for Hardware Accelerator Generators
https://calyxir.org
MIT License
496 stars 50 forks source link

Redesign `ControlID` to provide a better Interface #1366

Open calebmkim opened 1 year ago

calebmkim commented 1 year ago

Overview Currently, the ControlID pass provides methods that can label the control stmts with u64's, and we can query for the label of a given control stmt. We want to improve this analysis to provide a better interface for the user.

There are two main ideas discussed in #1358. The first is a more minor change to improve the interface of ControlID, the second is a more major change to get ControlID to actually build a full CDFG.

Improve ControlID Interface to Give Tokens instead of u64's

ControlID should produce a CFG

rachitnigam commented 1 year ago

Thank you for the super awesome writeup @calebmkim!

One thing to note is that sometimes, we might sometimes want different "labels" for different tokens.

This is something I've been pondering as well–we use the controlID stuff in slightly, but importantly, different ways in various passes like TDCC (which, AFAIK, doesn't even use ControlID). I think a good question is what would be a good representation/set of node ID that works for every pass?

ControlIDshould take in a control block, and internally build a full CFG.

We can even do one step further and change ControlID::new to take full ownership of the ir::Control struct. This means that instead of indirectly referencing to nodes in a modified ir::Control, the user must "detach" the control program and give it to the Analysis. This way, we can make queries like "give me an iterator over the control program in the order of the control flow edges".

Once a pass is done with it's CFG-based computation, we can use a ControlID::take function to return the original control program.

The big thing to note about this design point is that it doesn't let you add anything to the CFG, just query it and modify the groups underneath it. However, it is still powerful for analyses. Maybe we can cleverly design a Visitor-like interface for CFG-based passes that gives us control-flow order of things as well as the ability to modify it

One thing to add to this proposal is some notion of data flow/dependency: this could be helpful if we want to do some sort of "unsharing" in order to increase parallelization.

Yes! In fact, the current unsharing pass does something similar in computing a CFG but does so in a hacky way. @EclecticGriffin might have more insight on how exactly it works.

CDFGs, from what I understand, have basic blocks (i.e., blocks of instructions that you know they are just going to always just execute sequentially

I think we can use groups in place of basic blocks. Also, it is not necessary for us to build a CDFG for the kinds of passes we perform today but maybe useful to think about in the future.

sampsyo commented 1 year ago

Yes indeed; thanks for the really useful writeup, @calebmkim!

I think you did a good job mapping out what it would take to approach either of these two ideas. Perhaps it goes without saying, but this discussion also implicates the idea of "flattening" the entire IR so that integer IDs are already available (they are the way you identify control nodes), as in #1183 and friends.

I think it would be good to direct the discussion toward deciding what we will do: one of these, both, or some kind of intermediate point? Maybe a way to make that decision would be to resolve this question from @rachitnigam:

I think a good question is what would be a good representation/set of node ID that works for every pass?

Could we do an audit of the way in which these IDs are use so we can confidently describe what all these passes want from a set of IDs?