llvm / circt

Circuit IR Compilers and Tools
https://circt.org
Other
1.67k stars 298 forks source link

[HW] Dead values with cyclic dependencies are not removed by DCE #3135

Open uenoku opened 2 years ago

uenoku commented 2 years ago

Consider the following IR. This is a simple flip flop and it is dead (%o1 and %o2 are never connected to other values). DCE (circt-opt -canonicalize) cannot remove them because their uses are hold by each other.

hw.module @Dead_FF(%clk : i1, %D: i1) -> () {
  %c1 = hw.constant 1 : i1
  %Dn = comb.xor %D, %c1 : i1
  %t1 = comb.and %Dn, %clk : i1
  %t2 = comb.and %D, %clk : i1
  %u1 = comb.or %t1, %o2 : i1
  %u2 = comb.or %t2, %o1 : i1
  %o1 = comb.xor %u1, %c1 : i1
  %o2 = comb.xor %u2, %c1 : i1
  hw.output 
}

Current output:

module Dead_FF( // foo.mlir:1:1
  input clk,
            D);

  wire _GEN;    // foo.mlir:10:8
  wire _GEN_0;  // foo.mlir:9:8

  assign _GEN_0 = ~(~D & clk | _GEN);   // foo.mlir:4:8, :5:8, :7:8, :9:8, :10:8
  assign _GEN = ~(D & clk | _GEN_0);    // foo.mlir:6:8, :8:8, :9:8, :10:8
endmodule
dtzSiFive commented 2 years ago

FWIW, looks like the canonicalizer has a DCE/liveness-analysis that's gated on region-simplify that can remove this, here's the output with -canonicalize=region-simplify :

module Dead_FF( // <stdin>:2:3
  input clk,
        D);

endmodule

Did you have an approach in mind for resolving this?

Thinking out loud a little: Regarding the region simplification.... I'm not sure offhand why we disable it (in firtool, for example)-- performance or even safety concerns? If safety, maybe we can fix it. If performance, unless it's known to definitely be unreasonably expensive, might be worth some measurements (e.g., compile-time time/memory overall/breakdown, statistics on what's removed/impact on pipeline) and regardless maybe it can be enabled for one but not all of the instances in the current pipeline, or at certain optimization levels when we have those. Point is we have options. Hmm 🤔 . I don't know why an iterative dataflow would be too expensive re:performance, but maybe I'm missing something.

Anyway, that functionality is exposed in RegionUtils.h's runRegionDCE (or simplifyRegions, which does more things too, for better or worse) if we can think of a good spot to drive that from.

Or we roll our own thing :grin:. Thoughts?

uenoku commented 2 years ago

Ah, interesting. Thank you for finding this out! I didn't expect region-simply can eliminate cyclic stuffs. Originally region-simplify has been disabled in firtool since https://github.com/llvm/circt/commit/cb0899d6e7ab0ceacec3a1debe1ea80808ed63c1. I think it was not because of safety concern, rather because region-simply was not considered to be effective on hardware IR (as far as I see the comment). Probably it would be good time to consider enabling region-simplify if there is no major complier time regression.