OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Why is the compressed file a single file, unlike the pre-trained weights, which span many files for training the model? #73

Open · hsb1995 opened this issue 3 months ago

hsb1995 commented 3 months ago

Why is the compressed file a single file, while the pre-trained weights consist of many files for training the model? Can the compressed file be used for downstream tasks? It feels strange.

ChenMnZ commented 2 months ago

During training we only save the LET (learnable equivalent transformation) and LWC (learnable weight clipping) parameters, so the checkpoint is a single small file rather than a full set of pretrained weights.
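
For intuition, a minimal sketch of why this yields one file; the parameter names, shapes, and file name below are illustrative assumptions, not the repo's actual code:

```python
# Hedged sketch, not OmniQuant's actual code: only the small set of learned
# quantization parameters is serialized, so the checkpoint is one small file.
import torch

# Hypothetical per-layer learnable parameters: LWC learns weight-clipping
# bounds; LET learns equivalent-transformation scales between activations
# and weights. Names and shapes here are illustrative only.
lwc_bounds = {"layers.0.self_attn": torch.ones(1)}
let_scales = {"layers.0.self_attn": torch.ones(4096)}

# A single torch.save call produces one file, unlike a sharded
# pretrained checkpoint that is split across many weight files.
torch.save({"lwc": lwc_bounds, "let": let_scales}, "omni_parameters.pth")
```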

To save the quantized model, set the --save_dir argument.
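
For downstream tasks, the model saved this way should load like a regular checkpoint. A minimal sketch, assuming --save_dir wrote a standard Hugging Face-format directory (the path and prompt below are hypothetical):

```python
# Minimal sketch: load the quantized checkpoint written via --save_dir for
# downstream inference. Assumes a standard Hugging Face-format directory.
from transformers import AutoModelForCausalLM, AutoTokenizer

save_dir = "./llama-7b-omniquant"  # hypothetical value passed to --save_dir
tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = AutoModelForCausalLM.from_pretrained(save_dir, device_map="auto")

inputs = tokenizer("Quantization reduces model size by", return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```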