shadjis closed this issue 7 years ago
The Counter constructors now take options for start, stop, stride, and gap, so they should do the math at compile time whenever possible. We can reopen this issue if experiments do not confirm that the counters are cheaper now.
We notice counters using DSPs, e.g.
acceli_12/\Counter_28/ctrs_1/ZynqBlackBoxesMultiplier_1/m/U0/i_mult/\gDSP.gDSP_only.iDSP/use_prim.appDSP48[0].bppDSP48[1].use_subtract_delay.subtract_delay/d1.dout_i_reg[0]
When stride is constant (and parallelism is always constant, I think), it should be possible to eliminate all the multipliers. This would not only save DSPs but may also significantly reduce synthesis time, since fewer cross-boundary optimizations would be required.
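To illustrate the idea, here is a rough sketch in plain Scala (no Chisel dependency) of how per-lane offsets could be folded at elaboration time when stride is a known constant. Every name in it (`CounterSketch`, `ConstStride`, `RuntimeStride`, `laneOffsets`, and so on) is hypothetical and made up for this example; it is not the API of the actual Counter template. It only models which lanes would still need a hardware multiplier.

```scala
object CounterSketch {
  // Hypothetical model: a stride is either known at elaboration time
  // or only available as a runtime signal.
  sealed trait Stride
  final case class ConstStride(value: Int) extends Stride
  case object RuntimeStride extends Stride

  // Hypothetical model of what each parallel lane needs in hardware.
  sealed trait LaneOffset
  // Offset is a literal, so the lane needs only an adder.
  final case class FoldedConst(value: Int) extends LaneOffset
  // Offset is lane * stride computed in hardware (a candidate for a DSP).
  final case class NeedsMultiplier(lane: Int) extends LaneOffset

  // With a constant stride, lane * stride is computed here, in Scala.
  private def constLane(lane: Int, stride: Int): LaneOffset =
    FoldedConst(lane * stride)

  // With a runtime stride, only lane 0 (offset 0) can be folded.
  private def runtimeLane(lane: Int): LaneOffset =
    if (lane == 0) FoldedConst(0) else NeedsMultiplier(lane)

  // Parallelism `par` is assumed to always be a compile-time constant.
  def laneOffsets(stride: Stride, par: Int): Seq[LaneOffset] =
    stride match {
      case ConstStride(s) => (0 until par).map(i => constLane(i, s))
      case RuntimeStride  => (0 until par).map(i => runtimeLane(i))
    }

  // Number of lanes that would still require a hardware multiplier.
  def multiplierCount(offsets: Seq[LaneOffset]): Int =
    offsets.count {
      case NeedsMultiplier(_) => true
      case _                  => false
    }
}
```

In this model, `CounterSketch.laneOffsets(CounterSketch.ConstStride(3), 4)` contains only folded constants, so `multiplierCount` reports no multipliers, while the `RuntimeStride` case keeps one per lane except lane 0. A real template would emit literals and adders in place of `FoldedConst`, and the per-cycle increment `par * stride` could be folded the same way.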
In general, a lot of area could probably be saved in the common case of constant max, constant stride, counting upwards, etc. An example of such a counter can be found here: https://github.com/stanford-ppl/spatial-lang/blob/8d872264488376f0768971b1a088a1eaded8cb53/src/spatial/codegen/chiselgen/resources/template-level/templates/LineBuffer.scala#L10