Xilinx / finn

Dataflow compiler for QNN inference on FPGAs
https://xilinx.github.io/finn
BSD 3-Clause "New" or "Revised" License

Does FINN support pruning? #792

Closed jurevreca12 closed 1 year ago

jurevreca12 commented 1 year ago

I have searched the documentation, the code and the issues, but I can't find any mention of support for pruned neural networks. So my question is, does FINN support pruning? If not, is it on the roadmap? If it does, is there an example somewhere?

Motivation

Pruning is a useful technique to further decrease the size of a neural network, and could therefore allow FINN to deploy networks using fewer FPGA resources.

Pruning is supported by hls4ml and can be applied quite easily with the TensorFlow Model Optimization Toolkit.

auphelia commented 1 year ago

Hi @jurevreca12,

On the training side, pruning is easy to do in Brevitas (e.g. with the PyTorch pruning toolkit). To take advantage of it in hardware, there are currently two options:
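
For the training side, here is a minimal sketch of what pruning a Brevitas layer with the standard PyTorch pruning utilities (`torch.nn.utils.prune`) might look like. The layer sizes, bit width, and 50% sparsity amount are illustrative assumptions, not values from the FINN or Brevitas documentation.

```python
import torch.nn.utils.prune as prune
from brevitas.nn import QuantLinear

# Quantized fully-connected layer (4-bit weights), as typically used
# in Brevitas models that are later exported to FINN.
layer = QuantLinear(in_features=64, out_features=32, bias=False, weight_bit_width=4)

# Unstructured magnitude pruning: zero out the 50% of weights with the
# smallest L1 magnitude. The zeros stay in place, so the layer shape
# (and the exported ONNX graph) is unchanged.
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Fold the pruning mask into the weight tensor so the zeros are
# permanent and survive export to ONNX/QONNX.
prune.remove(layer, "weight")
```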

jurevreca12 commented 1 year ago

Hey @auphelia, thank you for this information. Just to make sure I've understood correctly: I need to set PE to the number of neurons in a layer, and SIMD to the number of inputs. Correct?

auphelia commented 1 year ago

In the case of the MVAU (MatrixVectorActivation), for example, this means PE = matrix height and SIMD = matrix width. Here is some additional information about folding: https://github.com/Xilinx/finn/blob/github-pages/docs/finn-sheduling-and-folding.pptx
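
As a rough sketch of what fully unrolling the MVAU layers could look like in a FINN build script, assuming the model has already been converted to hardware layers (the op_type name and the file names here are assumptions and may differ between FINN versions):

```python
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.custom_op.registry import getCustomOp

# Hypothetical intermediate model with hardware layers.
model = ModelWrapper("model_hw_layers.onnx")

for node in model.graph.node:
    if node.op_type == "MatrixVectorActivation":
        inst = getCustomOp(node)
        mw = inst.get_nodeattr("MW")  # matrix width  = number of inputs
        mh = inst.get_nodeattr("MH")  # matrix height = number of neurons
        # Fully unroll: SIMD = matrix width, PE = matrix height.
        inst.set_nodeattr("SIMD", mw)
        inst.set_nodeattr("PE", mh)

model.save("model_fully_unrolled.onnx")
```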