fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Propagate zeros from Conv layers to multiplication config #797

Closed bo3z closed 1 year ago

bo3z commented 1 year ago

Description

Convolutional layers use Dense matrix multiplication at their core, but do not fully utilise the benefit of n_zeros that Dense layers get in the Latency strategy.

  • When some weights are zero, HLS can reduce the number of DSPs used in the Latency strategy by using the resource pragma.
  • This is currently done for Dense layers and works quite well, even for RF != 1 (tested as part of #768).
  • However, n_zeros is always set to zero for Conv2D layers.
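
To illustrate the idea in the points above, here is a minimal NumPy-only sketch of counting the zero weights in a convolution kernel and passing the count along with the dimensions of the underlying Dense-style multiplication. It is not the hls4ml implementation: the helper names `count_zeros` and `conv_mult_params`, the dictionary keys other than n_zeros, and the (height, width, channels, filters) kernel layout are all assumptions made for this example.

```python
import numpy as np


def count_zeros(weights):
    """Return (n_zeros, n_nonzeros) for a weight tensor of any shape."""
    weights = np.asarray(weights)
    n_zeros = int(np.count_nonzero(weights == 0))
    return n_zeros, int(weights.size - n_zeros)


def conv_mult_params(kernel):
    """Hypothetical helper: parameters of the Dense-style multiplication
    inside a Conv2D layer, including the zero count.

    Assumes a kernel laid out as (height, width, in_channels, out_filters).
    """
    k_h, k_w, n_chan, n_filt = kernel.shape
    n_zeros, n_nonzeros = count_zeros(kernel)
    return {
        "n_in": k_h * k_w * n_chan,  # inputs per output pixel
        "n_out": n_filt,             # one output per filter
        "n_zeros": n_zeros,          # counted, instead of hard-coded to 0
        "n_nonzeros": n_nonzeros,
    }


# Example: a 3x3 kernel with 4 input channels and 8 filters,
# crudely pruned by zeroing small-magnitude weights.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 4, 8))
kernel[np.abs(kernel) < 0.5] = 0.0

params = conv_mult_params(kernel)
print(params)
```

The only point of the sketch is that the zero count is derived from the actual kernel rather than fixed at zero, so a pruned Conv2D layer could report the same information a Dense layer already does.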

Type of change

Tests

  • No new tests; the existing PyTests should verify that the Conv implementations are unchanged.
  • I can add some results from my synthesis of #768 to verify changes are correct.

Checklist