Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.
Add Hessian estimation per image hash.
Add sample-layer attention distillation loss.
Add per-layer weights to the soft rounding loss.
Update the GPTQ config and its generation for sample-layer attention.
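To illustrate the idea behind the sample-layer attention loss above, here is a rough sketch of weighting each layer's distillation error by a per-sample, per-layer score (e.g. derived from Hessian estimates). The function name, the softmax weighting scheme, and the tensor shapes are illustrative assumptions, not MCT's actual API:

```python
import numpy as np

def sample_layer_attention_loss(student_acts, teacher_acts, hessian_scores):
    """Hypothetical sketch of a sample-layer attention distillation loss.

    student_acts / teacher_acts: lists of (batch, features) arrays, one per layer.
    hessian_scores: (batch, n_layers) array of per-sample sensitivity estimates.
    """
    # Assumed weighting scheme: softmax over the layer axis, so that each
    # sample distributes attention across layers summing to 1.
    shifted = hessian_scores - hessian_scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    attention = exp / exp.sum(axis=1, keepdims=True)  # (batch, n_layers)

    total = 0.0
    for layer_idx, (s, t) in enumerate(zip(student_acts, teacher_acts)):
        # Per-sample mean squared error between student and teacher activations.
        per_sample_err = ((s - t) ** 2).mean(axis=1)  # (batch,)
        # Weight each sample's error by its attention to this layer.
        total += (attention[:, layer_idx] * per_sample_err).mean()
    return total
```

The sketch returns zero when student and teacher activations match exactly, and increases with the attention-weighted discrepancy.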
Checklist before requesting a review:
[ ] I set the appropriate labels on the pull request.
[ ] I have added/updated the release note draft (if necessary).
[ ] I have updated the documentation to reflect my changes (if necessary).
[ ] All functions and files are well documented.
[ ] All functions and classes have type hints.
[ ] There is a license header in all files.
[ ] The function and variable names are informative.