cms-L1TK / firmware-hls

HLS implementation of the tracklet pattern reco modules of the Hybrid tracking chain

TrackBuilder with ordered merges #333

Closed aehart closed 5 months ago

aehart commented 6 months ago

This PR rewrites the TrackBuilder to use ordered merges to achieve the required ordering of the outputs. This was first implemented by @aryd for L1L2, and has now been generalized to all seeds. The output in C-simulation is unchanged.
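
For readers unfamiliar with the technique, here is a minimal software sketch of a two-input ordered merge. It is not the firmware code from this PR: the `Match` struct, the `trackletIndex` sort key, and the `orderedMerge` function are hypothetical names chosen for illustration, and the sketch assumes each input stream is already sorted by that key. The idea is that emitting the smaller head of the two streams at each step yields an output that is itself ordered, with no separate sorting stage.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a match entry; only the sort key matters here.
struct Match {
  std::uint32_t trackletIndex;  // assumed sort key
  std::uint32_t source;         // which input stream the entry came from
};

// Merge two streams that are each already sorted by trackletIndex.
// Exactly one entry is emitted per loop iteration.
// Ties go to stream `a`, so the merge is stable.
std::vector<Match> orderedMerge(const std::vector<Match>& a,
                                const std::vector<Match>& b) {
  std::vector<Match> out;
  out.reserve(a.size() + b.size());
  std::size_t ia = 0, ib = 0;
  while (ia < a.size() || ib < b.size()) {
    // Take from `a` if `b` is exhausted, or if a's head is not larger.
    const bool takeA =
        ib == b.size() ||
        (ia < a.size() && a[ia].trackletIndex <= b[ib].trackletIndex);
    out.push_back(takeA ? a[ia++] : b[ib++]);
  }
  return out;
}
```

More than two inputs can be handled by chaining such mergers in a tree, since the output of each merge is again an ordered stream. Whether the TrackBuilder arranges its merges that way is not stated here.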

The post-implementation results are below. Timing improves slightly for most seeds and degrades slightly for three (L1L2, D3D4, and L2D1), but it is still met in all cases. The larger gain is in resource utilization: LUT usage drops by 59-67% and FF usage by 33-40%.

Post-implementation results

Minimum clock period

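| seed | old TrackBuilder | new TrackBuilder | delta |
|------|------------------|------------------|-------|
| L1L2 | 3.877 ns | 3.939 ns | +1.6% |
| L2L3 | 3.905 ns | 3.671 ns | -6.0% |
| L3L4 | 3.894 ns | 3.736 ns | -4.1% |
| L5L6 | 3.641 ns | 3.579 ns | -1.7% |
| D1D2 | 3.799 ns | 3.675 ns | -3.3% |
| D3D4 | 3.550 ns | 3.551 ns | +0.028% |
| L1D1 | 3.748 ns | 3.516 ns | -6.2% |
| L2D1 | 3.624 ns | 3.657 ns | +0.91% |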

LUT utilization

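| seed | old TrackBuilder | new TrackBuilder | delta |
|------|------------------|------------------|-------|
| L1L2 | 15661 | 5234 | -67% |
| L2L3 | 15475 | 5318 | -66% |
| L3L4 | 12791 | 4578 | -64% |
| L5L6 | 8335 | 3179 | -62% |
| D1D2 | 10161 | 3893 | -62% |
| D3D4 | 8316 | 3234 | -61% |
| L1D1 | 5865 | 2401 | -59% |
| L2D1 | 8319 | 3233 | -61% |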

FF utilization

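| seed | old TrackBuilder | new TrackBuilder | delta |
|------|------------------|------------------|-------|
| L1L2 | 10002 | 5981 | -40% |
| L2L3 | 9791 | 5916 | -40% |
| L3L4 | 8405 | 5129 | -39% |
| L5L6 | 5601 | 3702 | -34% |
| D1D2 | 6748 | 4518 | -33% |
| D3D4 | 5770 | 3783 | -34% |
| L1D1 | 4774 | 3103 | -35% |
| L2D1 | 5770 | 3783 | -34% |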
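
The delta columns above are consistent with a relative change measured against the old TrackBuilder, that is, (new - old) / old. A small sketch of that arithmetic follows; the helper name `percentDelta` is hypothetical and is not part of the repository.

```cpp
#include <cmath>

// Relative change of `newValue` with respect to `oldValue`, in percent.
// Negative values mean the new TrackBuilder uses less (or is faster).
double percentDelta(double oldValue, double newValue) {
  return (newValue - oldValue) / oldValue * 100.0;
}
```

For example, the L1L2 LUT counts (15661 old, 5234 new) give roughly -66.6%, which rounds to the -67% shown in the table.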