[LSQ] Formalize and modularize LSQ placement logic

This is a semi-formal specification for formalizing and modularizing the LSQ placement logic, which happens before and up to the cf-to-handshake conversion. We outline the necessary changes, along with their respective purposes, to achieve a flexible yet robust implementation. The overarching goal is to make all decisions related to memory interface placement before the cf-to-handshake conversion pass and to store them in attributes on IR operations. The conversion pass itself should only perform sanity checks (making sure the LSQ placement specified by the IR is possible and makes sense) and connect memory interfaces to control/access ports as required.

We want complete control over the contents of the groups that each LSQ is instantiated with. Currently, as in the legacy implementation, every basic block is assumed to be its own group, but this does not have to be the case. Memory access ports (i.e., load/store-like MLIR operations) need to specify which group they belong to, and whether they should be connected to an LSQ at all. We propose a single operation attribute, handshake.mem_interface, with an optional integer parameter to encode this information. Below are a few examples. Note that the specific memory interface each attribute references is implicitly determined by the memory region that its owning operation accesses, so it does not need to be stored in the attribute itself.

// This port will be connected to group 0 of its memory region's LSQ
%loadData1 = memref.load %memRegion[%loadAddr1] {mem = #handshake.mem_interface(LSQ: 0)} : memref<SIZExi32>
...
// This port will be connected to its memory region's MC (no group number was provided)
%loadData2 = memref.load %memRegion[%loadAddr2] {mem = #handshake.mem_interface(MC)} : memref<SIZExi32>
...
// This port will be connected to group 1 of its memory region's LSQ (note that the LSQ will be different from the first example, since the memory region is different)
memref.store %storeData3, %otherMemRegion[%storeAddr3] {mem = #handshake.mem_interface(LSQ: 1)} : memref<SIZExi32>
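
To make the encoding concrete, here is a minimal Python sketch of a parser for the textual attribute form used in the examples above. The `MemInterface` class and `parse_mem_interface` function are hypothetical names, not part of Dynamatic. The sketch also assumes that an LSQ port must always carry a group number and that an MC port never does, which the proposal implies but does not state outright.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemInterface:
    """Hypothetical model of the proposed attribute (not Dynamatic's API)."""
    kind: str                    # "LSQ" or "MC"
    group: Optional[int] = None  # LSQ group number; None for an MC port


# Matches "#handshake.mem_interface(LSQ: 0)" and "#handshake.mem_interface(MC)".
_ATTR_RE = re.compile(r"#handshake\.mem_interface\((LSQ|MC)(?::\s*(\d+))?\)")


def parse_mem_interface(text: str) -> MemInterface:
    m = _ATTR_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a mem_interface attribute: {text!r}")
    kind, group = m.group(1), m.group(2)
    # Assumption: the group number is mandatory for LSQ and forbidden for MC.
    if kind == "LSQ" and group is None:
        raise ValueError("an LSQ port must name its group")
    if kind == "MC" and group is not None:
        raise ValueError("an MC port takes no group number")
    return MemInterface(kind, int(group) if group is not None else None)
```

For instance, `parse_mem_interface("#handshake.mem_interface(LSQ: 0)")` yields an LSQ port in group 0, while the `MC` form yields a port with no group.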

Ultimately, these attributes will be set by our polyhedral memory analysis. In the meantime, we can repurpose the recently introduced --force-memory-interface pass so that, when asked to connect every port to an LSQ, it groups accesses purely by basic block, which matches the current behavior.
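
As a rough illustration of this interim grouping, the Python sketch below assigns one group per (memory region, basic block) pair. The function name, the input tuple layout, and the choice to number groups per region in order of first appearance are all assumptions made for the example; the proposal does not prescribe a numbering scheme.

```python
from collections import defaultdict


def group_by_block(accesses):
    """Assign each access an LSQ group based solely on its basic block.

    accesses: list of (access_name, mem_region, block) in program order.
    Returns {access_name: group_number}. Groups are numbered independently
    for each memory region (assumption), starting from 0.
    """
    next_group = defaultdict(int)  # region -> next unused group number
    group_of = {}                  # (region, block) -> group number
    result = {}
    for name, region, block in accesses:
        key = (region, block)
        if key not in group_of:
            group_of[key] = next_group[region]
            next_group[region] += 1
        result[name] = group_of[key]
    return result
```

Two accesses to the same region in the same block end up in the same group, while accesses to different regions never share an LSQ, so their group numbers are unrelated.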

The cf-to-handshake pass should then perform the following steps to correctly instantiate all memory interfaces for the circuit.

1. Verify that all memory accesses are annotated with the handshake.mem_interface attribute. This way there is no default "hidden" behavior, and it is harder to make a mistake.
2. Collect all memory accesses for each memory region.
3. For each memory region, if any of its memory accesses needs to connect to an LSQ, perform the following sanity checks.
   a. For each group, verify that dominance makes sense (use MLIR's existing dominance analysis). There must be a clear dominance relationship between the blocks containing the accesses that belong to the group. The pass should also be able to identify the group's "first block", i.e., the one that dominates all the others and that will supply the group's control signal.
   b. Verify that groups make sense with respect to each other. Two groups cannot be triggered by the same control signal.
4. For each memory region that has an LSQ, establish the port ordering within each group (derivable from the dominance analysis in 3a and program order).
5. Instantiate memory interfaces as needed.
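
The checks and ordering in the first four steps can be sketched in plain Python as follows. This is only a model under stated assumptions, not the pass itself: the real implementation would rely on MLIR's dominance analysis rather than the hand-rolled `dominators` helper, and the names `dominators` and `plan_interfaces` and the tuple layouts are hypothetical. The sketch reads "clear dominance relationship" as "the group's blocks form a chain under dominance", and reads "same control signal" as "same first block"; both are assumptions.

```python
from collections import defaultdict


def dominators(cfg, entry):
    """Iterative dominator sets. cfg maps every block to its successor list."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            ps = [dom[p] for p in preds[b]]
            new = ({b} | set.intersection(*ps)) if ps else {b}
            if new != dom[b]:
                dom[b], changed = new, True
    return dom


def plan_interfaces(accesses, cfg, entry):
    """accesses: (name, region, block, iface) tuples in program order, where
    iface is None (unannotated), ("MC", None) or ("LSQ", group)."""
    # Step 1: refuse any access that lacks an annotation.
    missing = [name for name, _, _, iface in accesses if iface is None]
    if missing:
        raise ValueError(f"unannotated memory accesses: {missing}")
    # Step 2: collect accesses per memory region, remembering program order.
    regions = defaultdict(list)
    for idx, (name, region, block, iface) in enumerate(accesses):
        regions[region].append((idx, name, block, iface))
    dom = dominators(cfg, entry)
    plan = {}
    for region, accs in regions.items():
        groups = defaultdict(list)
        for idx, name, block, (kind, group) in accs:
            if kind == "LSQ":
                groups[group].append((idx, name, block))
        leaders, ports = {}, {}
        for group, members in groups.items():
            # Step 3a: blocks must form a dominance chain (assumption).
            chain = sorted({b for _, _, b in members}, key=lambda b: len(dom[b]))
            for a, b in zip(chain, chain[1:]):
                if a not in dom[b]:
                    raise ValueError(f"{region}/group {group}: {a} does not dominate {b}")
            # Step 3b: no two groups may share a first block (assumption).
            if chain[0] in leaders.values():
                raise ValueError(f"{region}: two groups are triggered by {chain[0]}")
            leaders[group] = chain[0]
            # Step 4: order ports by block position in the chain, then program order.
            rank = {b: i for i, b in enumerate(chain)}
            ordered = sorted(members, key=lambda m: (rank[m[2]], m[0]))
            ports[group] = [name for _, name, _ in ordered]
        plan[region] = {"leaders": leaders, "ports": ports}
    return plan
```

Step 5 (the actual instantiation) is deliberately left out, since it depends on the backend representation of memory ports. A region whose accesses all target an MC simply ends up with empty `leaders` and `ports` maps.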

Some relatively light backend changes are also necessary to implement these features, but for brevity we do not describe them here (i.e., the internal representation of memory ports needs to be somewhat "relaxed").

EPFL-LAP / dynamatic

[LSQ] Formalize and modularize LSQ placement logic #52