Pull Request Description:
- Add a `freeze_quant_params` flag to the base trainable quantizer, with `False` as the default.
- Implement quantization-parameter freezing for the STE activation quantizers (a sketch of the mechanism follows this list).
- Use trainable activation quantizers (with frozen quantization parameters) in GPTQ instead of inferable quantizers.
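To make the mechanism concrete, here is a minimal, hypothetical PyTorch-style sketch of the flag and the freezing behavior. The class names, signatures, and the symmetric STE rounding used here are illustrative assumptions, not MCT's actual implementation.

```python
# Hypothetical sketch only -- names and structure are assumptions, not MCT's API.
import torch
import torch.nn as nn


class BaseTrainableQuantizer(nn.Module):
    """Base trainable quantizer carrying the new freeze flag (default False)."""

    def __init__(self, freeze_quant_params: bool = False):
        super().__init__()
        self.freeze_quant_params = freeze_quant_params


class STEActivationQuantizer(BaseTrainableQuantizer):
    """Symmetric STE activation quantizer whose threshold can be frozen."""

    def __init__(self, threshold: float, num_bits: int = 8,
                 freeze_quant_params: bool = False):
        super().__init__(freeze_quant_params)
        self.num_bits = num_bits
        # When the flag is set, the quantization parameter is created as a
        # non-trainable tensor, so the optimizer never updates it.
        self.threshold = nn.Parameter(
            torch.tensor(threshold),
            requires_grad=not freeze_quant_params,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Straight-through estimator: quantize in the forward pass, pass
        # gradients through unchanged in the backward pass.
        scale = self.threshold / (2 ** (self.num_bits - 1))
        q = torch.clamp(
            torch.round(x / scale),
            -2 ** (self.num_bits - 1),
            2 ** (self.num_bits - 1) - 1,
        ) * scale
        return x + (q - x).detach()


# GPTQ-style usage: the activation quantizer participates in the graph, but
# its quantization parameter stays fixed during optimization.
quantizer = STEActivationQuantizer(threshold=4.0, freeze_quant_params=True)
out = quantizer(torch.randn(2, 8))
assert not quantizer.threshold.requires_grad
```

Freezing via the parameter's trainability keeps the quantizer inside the trainable-quantizer hierarchy while guaranteeing that GPTQ only optimizes what it is meant to, which matches the intent described above.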
Checklist before requesting a review:
[ ] I set the appropriate labels on the pull request.
[ ] I have added/updated the release note draft (if necessary).
[ ] I have updated the documentation to reflect my changes (if necessary).
[ ] All functions and files are well documented.
[ ] All functions and classes have type hints.
[ ] There is a license in all files.
[ ] The function and variable names are informative.