sony / model_optimization

Model Compression Toolkit (MCT) is an open source project for optimizing neural network models for deployment on efficient, constrained hardware. It provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
https://sony.github.io/model_optimization/
Apache License 2.0

Fix activation gradient backprop in GPTQ #1197

Closed (irenaby closed this 1 month ago)

irenaby commented 2 months ago

Pull Request Description:

Add a freeze_quant_params flag to the base trainable quantizer, defaulting to False. Implement quantization-parameter freezing for the STE activation quantizers. In GPTQ, replace the inferable activation quantizers with trainable activation quantizers whose quantization parameters are frozen.
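
The idea described above can be illustrated with a minimal, framework-free sketch. It is not MCT's implementation: the class ToySteActivationQuantizer, its threshold and n_bits fields, and the hand-written backward method are all hypothetical, and the threshold gradient is deliberately simplified. Only the freeze_quant_params name and its False default come from the description. The sketch shows a straight-through estimator (STE) passing gradients back to the activations while a frozen quantizer reports no gradient for its own quantization parameter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ToySteActivationQuantizer:
    """Hypothetical stand-in (not MCT's class): unsigned uniform fake-quantizer."""

    threshold: float = 1.0             # assumed quantization parameter: upper clip bound
    n_bits: int = 8                    # assumed bit width
    freeze_quant_params: bool = False  # flag name and default taken from the PR text

    def forward(self, x: List[float]) -> List[float]:
        # Clip to [0, threshold], then snap to a uniform grid of 2**n_bits - 1 steps.
        step = self.threshold / (2 ** self.n_bits - 1)
        return [round(min(max(v, 0.0), self.threshold) / step) * step for v in x]

    def backward(
        self, x: List[float], grad_out: List[float]
    ) -> Tuple[List[float], Optional[float]]:
        # STE: treat round() as the identity, so the gradient w.r.t. the input
        # passes through unchanged inside the clip range and is zero outside it.
        grad_x = [g if 0.0 <= v <= self.threshold else 0.0
                  for v, g in zip(x, grad_out)]
        if self.freeze_quant_params:
            # Frozen: the quantization parameter is not trained, so no gradient
            # is reported for it; the activations still receive grad_x.
            return grad_x, None
        # Simplified assumption: only inputs clipped from above depend on the
        # threshold (d out / d threshold = 1 there); rounding terms are ignored.
        grad_threshold = sum(g for v, g in zip(x, grad_out) if v > self.threshold)
        return grad_x, grad_threshold
```

Under these assumptions, calling backward on a quantizer built with freeze_quant_params=True returns a gradient list for the inputs and None for the threshold, whereas the default (unfrozen) quantizer also returns a threshold gradient. A real implementation would normally rely on the framework's automatic differentiation rather than a manual backward method.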

Checklist before requesting a review: