fastmachinelearning / qonnx

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
https://qonnx.readthedocs.io/
Apache License 2.0

Handle ExtractConvBias properly and remove unused bias Quant nodes #122

Open joannapng opened 4 months ago

joannapng commented 4 months ago

During the ExtractBiasFromConv transformation, the bias initializer is sometimes not found. The current code looks up the initializer by the name of the tensor that feeds the convolution node, but when the bias is quantized it should instead look up the initializer named after the first input of the Quant node that produces this tensor. For the same reason, it is the producer node that should be deleted, not the tensor.
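
For context, the two graph patterns look like this (an illustrative sketch; the tensor names are made up):

    # plain bias:      bias_init ----------------> Conv.input[2]
    # quantized bias:  bias_init --> Quant ------> Conv.input[2]
    #
    # In the quantized case, model.get_initializer(conv.input[2]) returns
    # None, because conv.input[2] is the Quant node's output tensor; the
    # initializer is named after the Quant node's first input instead.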

maltanar commented 2 months ago

Hi @joannapng , thanks for the PR! It would be nice for ExtractConvBias to handle quantized biases, but there are a few problems that need to be fixed first.

  1. Your changes break the transformation for Conv nodes with non-quantized biases: the second line here accesses producer.input without checking whether producer is None, which happens when the bias is a plain initializer rather than the output of a Quant node:

                        producer = model.find_producer(n.input[2])
                        bias = model.get_initializer(producer.input[0])
  2. To handle quantized biases correctly, the second input of the Add node we insert should be the quantized bias (the output of the bias Quant node), not the un-quantized input to the bias quantizer itself. You are currently removing the Quant node entirely, which does not preserve the semantics of the original network: the PR rewrites Conv(x, bias=Quant(unquant_b)) to Conv(x) + unquant_b, but it should produce Conv(x) + Quant(unquant_b) instead (see the sketch below).

If you could make these fixes and perhaps add a unit test that covers both the quantized and unquantized bias cases, that would be great!
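
Putting both fixes together, here is a minimal sketch of how the transformation could handle plain and quantized biases side by side. This is an illustration, not the actual qonnx implementation: it assumes the usual ModelWrapper helpers (find_producer, get_initializer, set_initializer, get_tensor_shape, make_new_valueinfo_name), assumes shape inference has been run, and elides details such as reshaping per-channel Quant scales and re-sorting the graph after the insertion.

    from onnx import helper

    def extract_conv_bias(model):
        # Sketch of an ExtractBiasFromConv-style pass over a qonnx
        # ModelWrapper; assumes tensor shapes have been inferred.
        graph = model.graph
        for n in list(graph.node):
            if n.op_type != "Conv" or len(n.input) != 3:
                continue
            bias_name = n.input[2]
            producer = model.find_producer(bias_name)
            if producer is not None and producer.op_type == "Quant":
                # Quantized bias: the raw values live one node upstream, as
                # the first input of the Quant node (fix for problem 1: only
                # dereference producer.input when a producer exists).
                init_name = producer.input[0]
            else:
                # Plain bias: the Conv input itself names the initializer.
                init_name = bias_name
            bias = model.get_initializer(init_name)
            if bias is None:
                continue  # dynamically computed bias, leave the node alone
            # Reshape [C] to [1, C, 1, ...] so the Add broadcasts over the
            # Conv output.
            out_rank = len(model.get_tensor_shape(n.output[0]))
            new_shape = [1, bias.shape[0]] + [1] * (out_rank - 2)
            model.set_initializer(init_name, bias.reshape(new_shape))
            # Rewire: the Conv writes to a fresh tensor and a new Add node
            # restores the bias. The Add consumes bias_name, i.e. the Quant
            # *output* when the bias is quantized, so the Quant node stays
            # in the graph (fix for problem 2: Conv(x) + Quant(b), not
            # Conv(x) + b).
            tmp = model.make_new_valueinfo_name()
            add_node = helper.make_node("Add", [tmp, bias_name], [n.output[0]])
            n.output[0] = tmp
            del n.input[2]
            graph.node.append(add_node)  # a real pass would re-sort the graph
        return model

A unit test along the lines requested above could build two tiny single-Conv models, one with a plain bias initializer and one with a Quant node on the bias, apply the transformation to each, and check that execution outputs match the original models.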