fastmachinelearning / qonnx

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
https://qonnx.readthedocs.io/
Apache License 2.0

HGQ support for QONNX #123

Open makoeppel opened 4 months ago

makoeppel commented 4 months ago

This is a first prototype of integrating HGQ into QONNX. I wanted to open this PR early to start a discussion on how we will proceed, especially with the FINN backend.

How is it done & Questions:

- The `from_keras()` function was changed to check for HGQ models (open question: what about mixed models?).
- The proxy model from HGQ works a bit differently than QKeras, where the quantization information sits in each layer: instead, it adds the quantization information *after* each layer to be quantized, in dedicated `FixedPointQuantizer` layers. For QKeras, the quantizers are replaced with the equivalent Keras layers (e.g. `q_conv2d` --> `conv2d`); I added the analogous option of replacing each `FixedPointQuantizer` with a custom Identity layer, but it is currently unused, since taking the pure proxy model as-is is enough. Some cleanup is needed here, but I was unsure which option is best.
- After this, the model is exported with `tf2onnx.convert.from_keras`, and the quantizers are used to insert a node with a new op type, `FixedPoint`, after each `FixedPointQuantizer` (in the HGQ style); a detection sketch is shown below.
- The `FixedPointQuantizer` nodes are no longer needed after this step, so I used the `cleanup_model()` function to remove them and keep only the new `FixedPoint` nodes; a sketch of such a removal pass also follows the list.
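For the first step, a minimal sketch of what the HGQ check inside `from_keras()` could look like, assuming HGQ proxy models can be recognized by layers whose class name is `FixedPointQuantizer`. The helper name `is_hgq_model` is hypothetical and not part of the current QONNX API:

```python
# Hypothetical detection helper, sketched under the assumption that an
# HGQ proxy model contains at least one FixedPointQuantizer layer.
import tensorflow as tf


def is_hgq_model(model: tf.keras.Model) -> bool:
    """Return True if any layer in the model looks like an HGQ FixedPointQuantizer."""
    return any(type(layer).__name__ == "FixedPointQuantizer" for layer in model.layers)


# Inside from_keras(), the converter could then branch on this check:
#   if is_hgq_model(model): ... HGQ path (tf2onnx export + FixedPoint insertion) ...
#   else:                   ... existing QKeras path ...
```

Matching on the class name rather than importing from HGQ keeps the check from adding a hard dependency, though an explicit `isinstance` check against the HGQ class would be stricter.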
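For the cleanup step, a hedged sketch of a QONNX graph transformation that bypasses and removes leftover `FixedPointQuantizer` nodes, keeping only the inserted `FixedPoint` nodes. The class itself is illustrative (the PR uses `cleanup_model()`), and it assumes the data tensor is the node's first input:

```python
# Illustrative removal pass, written in the standard QONNX transformation
# style; not the exact code in this PR.
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.base import Transformation


class RemoveFixedPointQuantizers(Transformation):
    """Remove FixedPointQuantizer nodes by rewiring producers to consumers."""

    def apply(self, model: ModelWrapper):
        graph_modified = False
        for node in model.graph.node:
            if node.op_type == "FixedPointQuantizer":
                # bypass the node: feed its (assumed first) data input
                # directly to all consumers of its output
                producer_tensor = node.input[0]
                consumers = model.find_consumers(node.output[0]) or []
                for consumer in consumers:
                    for i, inp in enumerate(consumer.input):
                        if inp == node.output[0]:
                            consumer.input[i] = producer_tensor
                model.graph.node.remove(node)
                graph_modified = True
                break  # graph changed; transform() will re-apply until stable
        return model, graph_modified
```

Usage would follow the usual pattern, e.g. `model = model.transform(RemoveFixedPointQuantizers())` on a `ModelWrapper` loaded from the exported ONNX file (filename hypothetical).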

Tests:

ToDos: