OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Minor improvements #9

Closed: jeethu closed this 11 months ago

jeethu commented 12 months ago
  1. Allow using multiple GPUs in generate_act_scale_shift.py
  2. Allow quantizing models that share an architecture with a supported model but have a different model name, by passing the --net command line argument.
  3. Allow using activation scale and shift files with non-default names, by passing the --act-scales and --act-shifts command line arguments.
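
A minimal sketch of how items 2 and 3 could fit together, not the actual code in this PR. Only the flag names --net, --act-scales and --act-shifts come from the description above; the --model flag, the fallback to the last path component, the default file locations and the helper names build_parser and resolve_args are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser; only --net, --act-scales and --act-shifts
    # are named in the PR description.
    parser = argparse.ArgumentParser(description="Sketch of the new options")
    parser.add_argument("--model", required=True,
                        help="path or name of the checkpoint (assumed flag)")
    parser.add_argument("--net", default=None,
                        help="name of a supported architecture to treat the model as")
    parser.add_argument("--act-scales", default=None,
                        help="path to the activation scales file")
    parser.add_argument("--act-shifts", default=None,
                        help="path to the activation shifts file")
    return parser


def resolve_args(argv):
    args = build_parser().parse_args(argv)
    # Assumption: without --net, fall back to the last component of --model.
    if args.net is None:
        args.net = args.model.rstrip("/").split("/")[-1]
    # Assumption: default file names are derived from the resolved net name.
    if args.act_scales is None:
        args.act_scales = f"./act_scales/{args.net}.pt"
    if args.act_shifts is None:
        args.act_shifts = f"./act_shifts/{args.net}.pt"
    return args
```

With this layout, a fine-tuned checkpoint stored under an arbitrary directory name can reuse the handling of a supported architecture by passing, for example, --net llama-7b, while explicit --act-scales or --act-shifts paths take precedence over the derived defaults.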
ChenMnZ commented 11 months ago

Thank you for your valuable contribution to OmniQuant.