sony / model_optimization

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.
https://sony.github.io/model_optimization/
Apache License 2.0
322 stars, 50 forks

Filter out candidates for mixed precision in case of activation/weights only #1162

Closed (ofirgo closed 2 months ago)

ofirgo commented 2 months ago

Pull Request Description:

When running mixed precision quantization for weights compression only, some layers may still have multiple activation bit-width candidates. We need to filter out these irrelevant activation candidates so that such layers are no longer configurable with respect to activation. The same goes, in the opposite direction, for activation-only mixed precision: the irrelevant weights candidates are filtered out.

For this, we added a new procedure that runs before the mixed precision search and modifies the candidates of the relevant nodes in the graph.
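The idea behind the filtering step can be illustrated with a minimal, self-contained sketch. It does not use the MCT API: `Candidate`, `Node`, and `filter_candidates` are hypothetical names introduced only for illustration, and the choice to pin the non-searched bit-width to the node's maximal value is an assumption, not necessarily what this PR implements.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    # Hypothetical stand-in for one quantization candidate of a node.
    weights_n_bits: int
    activation_n_bits: int


@dataclass
class Node:
    # Hypothetical stand-in for a graph node and its bit-width candidates.
    name: str
    candidates: List[Candidate]


def filter_candidates(nodes: List[Node], keep: str) -> None:
    """Pin the bit-width that is NOT being searched, in place.

    keep="weights":    weights-only search, so activation bits are pinned.
    keep="activation": activation-only search, so weights bits are pinned.
    Assumption: the pinned value is the maximal bit-width among the
    node's candidates.
    """
    if keep not in ("weights", "activation"):
        raise ValueError("keep must be 'weights' or 'activation'")
    pinned_attr = "activation_n_bits" if keep == "weights" else "weights_n_bits"

    for node in nodes:
        if not node.candidates:
            continue
        pinned = max(getattr(c, pinned_attr) for c in node.candidates)
        kept = [c for c in node.candidates if getattr(c, pinned_attr) == pinned]
        # dict.fromkeys drops duplicates while preserving candidate order.
        node.candidates = list(dict.fromkeys(kept))


# Weights-only example: activation candidates collapse to a single value.
conv = Node("conv", [Candidate(w, a) for w in (8, 4, 2) for a in (8, 4)])
relu = Node("relu", [Candidate(8, 8), Candidate(8, 4)])
filter_candidates([conv, relu], keep="weights")
```

After the call, `conv` keeps three candidates that differ only in weights bit-width, while `relu` is left with a single candidate and is therefore non-configurable for the search.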

Checklist before requesting a review: