efeslab / Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
277 stars, 24 forks

How to load quantized weight? #14

Closed by ghost 7 months ago

ghost commented 7 months ago

Hi, I want to know how to load the quantized weights for evaluation.

happierpig commented 7 months ago

Hi @mxjyst,

Thanks for your interest in our project. Currently, the codebase is a prototype for reproducing the experiments in our paper. We do not save the quantized weights, so there is no checkpoint to load; each run performs fake quantization from scratch. Sorry for any inconvenience. Please stay tuned for future enhancements.
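
For readers unfamiliar with the term, "fake quantization" generally means rounding the weights to a low-bit integer grid and immediately converting them back to floating point, so the model runs in full precision but carries the quantization error. The snippet below is a minimal, dependency-free sketch of that idea using symmetric round-to-nearest on a single row of weights. It is not Atom's implementation: the function name `fake_quantize`, the `n_bits` parameter, and the per-row scaling scheme are assumptions made purely for illustration.

```python
def fake_quantize(row, n_bits=4):
    """Symmetric round-to-nearest fake quantization of one row of weights.

    Hypothetical helper for illustration only; not taken from the Atom codebase.
    Returns the dequantized floats and the scale that was used.
    """
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for signed 4-bit
    max_abs = max(abs(w) for w in row)
    if max_abs == 0.0:
        return list(row), 1.0             # all-zero row: nothing to quantize
    scale = max_abs / qmax
    # Map to the integer grid, clamp to the signed range, then map back to floats.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in row]
    return [v * scale for v in q], scale


weights = [7.0, -3.0, 1.2, 0.0]
dequantized, scale = fake_quantize(weights, n_bits=4)
print(dequantized, scale)  # [7.0, -3.0, 1.0, 0.0] 1.0
```

Because the result is recomputed from the original floating-point weights on every run, nothing needs to be stored on disk, which matches the behavior described above. If a saved copy were wanted, one option under these same assumptions would be to persist the dequantized values (or the integer grid plus the scale) after this step.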