A few user experience enhancements during the quantization and measurement process:
Graceful Exit Signal Handling:
Adds signal handling so the measurement pass of quantization can exit gracefully. This ensures the process can be safely paused or stopped, and gives the user a second chance if they hit CTRL-C unintentionally.
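As a rough illustration of the idea (not the code in this PR), a two-stage CTRL-C handler might look like the sketch below. `GracefulExit` and `measure_modules` are hypothetical names, and the policy shown (first CTRL-C finishes the current module and stops, second CTRL-C aborts immediately) is an assumption about what "second chance" means here.

```python
import signal


class GracefulExit:
    """Hypothetical helper: first Ctrl-C requests a stop, second one aborts."""

    def __init__(self):
        self.stop_requested = False
        self._previous = None

    def _handle(self, signum, frame):
        if self.stop_requested:
            # Second Ctrl-C: the user really means it, so abort right away.
            raise KeyboardInterrupt
        self.stop_requested = True
        print("\nCtrl-C received: finishing the current module, then exiting. "
              "Press Ctrl-C again to abort immediately.")

    def __enter__(self):
        # Install our handler and remember the old one so it can be restored.
        # Note: signal.signal() must be called from the main thread.
        self._previous = signal.signal(signal.SIGINT, self._handle)
        return self

    def __exit__(self, exc_type, exc, tb):
        signal.signal(signal.SIGINT, self._previous)
        return False


def measure_modules(modules, measure_fn, guard):
    """Measure modules in order, stopping cleanly once a stop is requested."""
    done = []
    for module in modules:
        measure_fn(module)
        done.append(module)  # a real job would save its checkpoint here
        if guard.stop_requested:
            break
    return done
```

Used as `with GracefulExit() as guard: measure_modules(modules, measure_fn, guard)`, the loop only checks the flag between modules, so an interrupted run always stops at a point where progress can be saved.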
Status Box with useful insights:
Implements a status box that appears each time a module's measurement completes and provides insight into the quantization run. Probably most useful are the overall accuracy of the quantization and measurement process at that point in time and a time estimate for completion of the full quant.
Stats in the status box include:
Rolling average time per step.
Estimated time to completion of the full process (uses the rolling average, with a default window of the last 10 modules).
Step tracking (current step out of total steps). This also resumes gracefully: on restart it attempts to determine which step it is on from the layers remaining and the steps completed so far.
Overall average accuracy of all measurements taken during the quantization process.
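The stats listed above could be tracked with something like the following minimal sketch. `StepStats` and its fields are hypothetical and not taken from the PR; the only detail drawn from the description is the rolling window defaulting to the last 10 modules. Resume is modelled simply by passing in the number of steps already completed.

```python
from collections import deque


class StepStats:
    """Hypothetical tracker for the numbers shown in the status box."""

    def __init__(self, total_steps, completed_steps=0, window=10):
        self.total_steps = total_steps
        # On resume, start from the steps a previous run already finished.
        self.completed_steps = completed_steps
        # deque(maxlen=...) silently drops the oldest timing once full.
        self.times = deque(maxlen=window)
        self.accuracy_sum = 0.0
        self.accuracy_count = 0

    def record(self, seconds, accuracy):
        """Register one finished module: its duration and measured accuracy."""
        self.times.append(seconds)
        self.accuracy_sum += accuracy
        self.accuracy_count += 1
        self.completed_steps += 1

    @property
    def rolling_avg(self):
        """Average seconds per step over the most recent `window` steps."""
        return sum(self.times) / len(self.times) if self.times else 0.0

    @property
    def eta_seconds(self):
        """Remaining steps multiplied by the rolling average step time."""
        return self.rolling_avg * (self.total_steps - self.completed_steps)

    @property
    def mean_accuracy(self):
        """Average accuracy over every measurement recorded in this run."""
        if self.accuracy_count == 0:
            return 0.0
        return self.accuracy_sum / self.accuracy_count


stats = StepStats(total_steps=5, window=2)
for seconds, accuracy in [(10.0, 0.5), (20.0, 0.75), (30.0, 1.0)]:
    stats.record(seconds, accuracy)
print(f"step {stats.completed_steps}/{stats.total_steps}, "
      f"avg {stats.rolling_avg:.1f}s, eta {stats.eta_seconds:.1f}s, "
      f"accuracy {stats.mean_accuracy:.2f}")
```

Using a rolling window rather than a global mean lets the estimate follow recent step times, which matters when later modules take noticeably longer or shorter than early ones.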
The idea here is simply to improve the user experience by providing better control and visibility during the quantization process. For those paying for compute to quantize, the time-to-completion estimates can help with calculating compute costs.
https://github.com/turboderp/exllamav2/assets/5460972/a010f4f2-f575-49b7-8297-df3c75b9cf33