Revise the IR implementation

From implementing the EQY engine on top of the freshly implemented IR data structures, I've gathered some experience in using it and while I think I got quite a few things right and feel like the general concept of an e-graph based IR is delivering on the things that made me go for that design, there are some details about the implementation and API that I want to revise significantly before adding too much more code depending on it.

An incomplete list of things that should not change:

being e-graph based
being able to efficiently store fine-grained/bit-level and coarse-grained/word-level designs
- including both simultaneously with signal correspondence between them
supporting custom user defined node types
supporting certain automatic normalizations during e-graph rebuilding
- including constant propagation, so that user code never has to deal with nodes having constant inputs
have infrastructure for maintaining an acyclic subset of primary definitions (or guarded cycles only in case of sequential circuits)
- but not necessarily an IR-global single selection of primary definitions as is currently maintained

This should also ensure that any existing code can be kept working with reasonable effort.

This is a currently incomplete list of changes to be made or issues to address:

[ ] TODO move the more complete items below into their own issues
[ ] getting rid of the current term/node distinction / supporting nodes without an output
- the idea behind this originally was that I wanted to consistently reduce every occurrence of a constant value (in a combinational design), making the absence of constant values an invariance IR users can depend on. An and-gate with a constant false output, would then reduce to a binary clause node on its two inputs.
- after some experience using the IR, I think the no-constant invariant would be sufficiently useful if restricted to node inputs, and unconditionally knowing that a node has an output would remove a lot of unwrap() calls where the code knows that it is dealing with a TermNode but the the used IR API also has to handle non-Term Nodes.
- always representing constraints by having a Boolean output with an equivalence between that output to true also has some major advantages
  - it would mean that every constraint is represented by an equivalence, making it easier (or even possible) to write code that handles all constraints generically
  - makes it possible to write generic code to reify constraints by replacing known equivalences with comparisons that produce an output indicating whether an equivalence holds
    - this is useful since e.g. unrolling a design for induction can turn an equivalence that's known to hold unconditionally on the combinational design into an assumption that could fail for unreachable input states and thus unrolling should turn equivalences into comparisons
    - current unrolling code in the EQY engine only uses the primary definition of a signal, discarding any invariants (equivalences or binary clauses) since there was no easy way to uniformly deal with this
[ ] e-graph rebuilding with automatic normalization should never automatically remove nodes
- currently reducing nodes during e-graph rebuilding e.g. for constant propagation will remove the original node
- the upside is that this avoid accumulating useless nodes with constant inputs, but the downsides are
  - that it makes the rebuilding code a lot more complicated than it needs to be
  - that it makes it really hard to write correct code that tracks any per node information (in contrast to per variable information which works well and is what existing code does)
- instead it should be sufficient to have a way to mark a node as redundant with an easy way to batch remove all redundant nodes at points where user code deems it safe to do so
[ ] automatic normalization during rebuilding should be optional and/or user-configurable
- [ ] TODO come up with a design
[ ] replace the dyn / vtable based dynamic typing for nodes with a more "serialization/deserialization" based approach
- it should still be possible to dynamically add new user-specific node types, this is just about the implementation technique and the API design
- the low-level node storage would still be implemented essentially the same way with pages of nodes of an uniform type and linear node-ids across those pages, so that
  - node access is fast
  - storage overhead is minimized
- while the dyn / vtable implementation has been a great and in particular flexible infrastructure to have while coming up with an initial IR design, I see the following major limitations:
  - it requires using dynamic casts all over the place, where pattern matching would feel much more natural
  - while it's extensible in adding new node types, the operations supported on those nodes have to be part of the core IR traits
    - any user specific operations have to be implemented on top of those generic operations or exhaustively handle all supported nodes via dynamic casts
- an alternative design that I'm currently trying to work out would instead represent nodes using a dynamic AST (think of JSON but with types like variables, literals, unordered-sets, opaque-ids/names)
  - here a page would then store nodes that have the same shape, with the shape being stored in the page metadata and the per-node storage would still consists of the data that varies per node (i.e. wouldn't increase from the current approach)
  - there would be good API support to efficiently convert this to and from user specified types (for types that have a fixed shape, this shouldn't have more overhead than current dynamic casts)
  - the same API could support multiple user types covering the same AST shape, in particular also enum types over multiple primitive nodes, such that code expecting combinational bit-level nodes could access those as an enum covering And/Xor/... and then use actual pattern matching (if there's any difference at all I'd expect a good implementation to have better performance than the sequence-of-dyn-casts currently required)
[ ] improved change tracking
- potentially related to user-configurable automatic normalization
- [ ] TODO describe the current change tracking, limitations and come up with improvements
[ ] improved and/or user-configurable indexing
- initially I tried to make this configurable using generic parameters on the IR, but that got unwieldy very quickly
- solving improved change tracking and user-configurable automatic normalization, should also make it clear how to solve user-specific index maintenance
- independent of that I'd really like to have a way to do lookups by definition and use, which on its own will always be builtin as it is required for egraph maintenance, but restricted to certain node types/shapes
  - I haven't been able to come up with a design for this that would have acceptable memory and runtime overhead to be unconditionally maintained, but I'm also not convinced yet that there is no way to do so
  - one idea would be to have a built in layer functionality where nodes can be added to one or more layers and per layer indices are maintained in addition to the global index, this would allow for having a "word-level" layer and a "bit-level" layer, but also short-lived ad-hoc layers to store metadata, tracking labels, etc.
[ ] improved primary definition maintenance
- while I'm still fully behind the concept of acyclic primary definitions, currently there's a single IR-global primary definition for each defined variable, which is awkward and insufficient when different representations of the circuit are stored in the same egraph, like
  - word-level + bit-level
  - gate based + LUT based
  - gate based but with two different subsets of primitive gates (AIG only vs all different 2/3-input gates)
  - different FF restrictions (AIGER vs arbitrary combinational initialization)
- [ ] TODO come up with a better design
[ ] proof tracing capability for e-graph rebuilding and automatic normalization
- even without any near term plans to actually implement checkable proof certificates, we should make sure that the infrastructure to trace modifications to the IR graph (along with their justification) is in place
- some of the changes above (like never implicitly deleting nodes) will make this easier than it would be for the current implementation

YosysHQ / imctk

Revise the IR implementation #6