tensorflow / tflite-micro

Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).
Apache License 2.0

Fail per-channel quantized FullyConnected layers #2602

Closed: rascani closed this 1 month ago

rascani commented 1 month ago

The TFLiteConverter recently switched to per-channel quantization for all Dense/FullyConnected layers. TFLite-Micro does not yet support this, and was applying incorrect quantization parameters to FullyConnected layers in newly converted models. Unsurprisingly, this produces invalid output.
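To make the failure mode concrete, here is a minimal pure-Python sketch of symmetric int8 weight quantization. The weights, the helper names, and the choice to misapply the first channel's scale to every channel are all assumptions made for illustration; the PR description does not say exactly which parameters TFLite-Micro was reading.

```python
# Illustrative only: weights, scales, and helper names are hypothetical.

def quantize(values, scale):
    """Symmetric int8 quantization: q = clamp(round(v / scale), -127, 127)."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map quantized integers back to real values with a single scale."""
    return [q * scale for q in q_values]


# Two output channels (rows) of a FullyConnected weight matrix whose
# value ranges differ by a factor of 50.
weights = [
    [0.4, -1.0, 0.2],        # channel 0, max |w| = 1.0
    [0.008, -0.02, 0.004],   # channel 1, max |w| = 0.02
]

# Per-channel quantization: one scale per output channel.
scales = [max(abs(w) for w in row) / 127 for row in weights]
q_weights = [quantize(row, s) for row, s in zip(weights, scales)]

# Correct reconstruction: each channel uses its own scale.
correct = [dequantize(q, s) for q, s in zip(q_weights, scales)]

# Assumed failure mode: a kernel that only understands per-tensor
# quantization applies one scale (here, channel 0's) to every channel.
wrong = [dequantize(q, scales[0]) for q in q_weights]

if __name__ == "__main__":
    print("quantized:", q_weights)
    print("correct  :", correct)
    print("wrong    :", wrong)
```

Both channels quantize to the same integers, so the scale is the only thing that distinguishes them. With the wrong scale, channel 1 is reconstructed about 50 times too large, which is the kind of silent corruption the runtime check is meant to catch.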

While we intend to add per-channel quantization support for FullyConnected, this PR adds a runtime check that fails on per-channel quantized FullyConnected layers until individual kernels support them. If you encounter this runtime error, you can disable the new Converter behavior by setting:

TFLiteConverter._experimental_disable_per_channel_quantization_for_dense_layers = True

See https://github.com/tensorflow/tensorflow/blob/377f47694fa790e98db6665b9adecde00b5e0d68/tensorflow/lite/python/lite.py#L674
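In practice the flag would most likely be set on a converter instance before calling `convert()`. The sketch below assumes that, and also assumes a full-integer conversion from a SavedModel; the helper names (`disable_per_channel_dense`, `per_channel_tensors`, `convert_for_tflm`) are hypothetical and not part of TensorFlow. The flag itself is private and experimental, so it may change or be removed.

```python
def disable_per_channel_dense(converter):
    """Opt out of per-channel quantization for Dense/FullyConnected layers.

    `converter` is expected to be a tf.lite.TFLiteConverter instance.
    """
    converter._experimental_disable_per_channel_quantization_for_dense_layers = True
    return converter


def per_channel_tensors(tensor_details):
    """Return the names of tensors that carry more than one quantization scale.

    `tensor_details` is the list of dicts returned by
    tf.lite.Interpreter.get_tensor_details(). Convolution weights may
    legitimately appear here; only FullyConnected weights are of concern.
    """
    return [
        d["name"]
        for d in tensor_details
        if len(d["quantization_parameters"]["scales"]) > 1
    ]


def convert_for_tflm(saved_model_dir, representative_dataset):
    """Hypothetical helper: int8 conversion with the flag set, then a check."""
    import tensorflow as tf  # imported lazily so the helpers above work without TF

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    disable_per_channel_dense(converter)
    model_bytes = converter.convert()

    # Inspect the converted model for tensors that still have per-channel scales.
    interpreter = tf.lite.Interpreter(model_content=model_bytes)
    return model_bytes, per_channel_tensors(interpreter.get_tensor_details())
```

Calling `convert_for_tflm("path/to/saved_model", my_representative_dataset)` would return the model bytes together with the names of any tensors that still have multiple scales, which gives a quick host-side sanity check before flashing the model to a device.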

BUG=b/324385802