efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Adding OPT support for simulated quantization.
#5
Closed
cylinbao closed this 9 months ago

cylinbao commented 9 months ago:
- Implement Atom simulated quantization for the OPT family (a minimal sketch follows this list).
- Restructure the codebase under `/model`.
- Rename the main file from `llama.py` to `main.py`.
- Add OPT perplexity results on WikiText2 to the README (see the evaluation sketch at the end).
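
For context, here is a minimal sketch of what simulated (fake) quantization over OPT's decoder linears could look like. The symmetric per-group INT4 scheme, the group size of 128, and the weight-only traversal are illustrative assumptions, not necessarily Atom's exact recipe or this repo's API.

```python
# Minimal fake-quantization sketch for OPT linear layers.
# Assumption: symmetric per-group INT4, weight-only; not Atom's exact scheme.
import torch
from transformers import OPTForCausalLM

def fake_quantize(w: torch.Tensor, n_bits: int = 4, group_size: int = 128) -> torch.Tensor:
    """Quantize then immediately dequantize a weight tensor, group-wise."""
    orig_shape = w.shape
    w = w.reshape(-1, group_size)                           # [num_groups, group_size]
    qmax = 2 ** (n_bits - 1) - 1                            # e.g. 7 for INT4
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return (q * scale).reshape(orig_shape)

@torch.no_grad()
def simulate_quantize_opt(model: OPTForCausalLM, n_bits: int = 4, group_size: int = 128):
    """Overwrite every decoder linear weight with its fake-quantized copy."""
    for layer in model.model.decoder.layers:
        for module in layer.modules():
            if isinstance(module, torch.nn.Linear):
                module.weight.data = fake_quantize(module.weight.data, n_bits, group_size)
    return model
```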
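
Likewise, a hedged sketch of how a WikiText2 perplexity number is commonly computed (strided full-sequence negative log-likelihood). The 2048-token sequence length, the test split, and the checkpoint name in the usage comment are assumptions, not details taken from this issue.

```python
# Sketch of WikiText2 perplexity evaluation over fixed-length chunks.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, OPTForCausalLM

@torch.no_grad()
def wikitext2_ppl(model, tokenizer, seqlen: int = 2048, device: str = "cuda") -> float:
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids
    n_chunks = enc.numel() // seqlen
    nlls = []
    model = model.to(device).eval()
    for i in range(n_chunks):
        batch = enc[:, i * seqlen:(i + 1) * seqlen].to(device)
        loss = model(batch, labels=batch).loss        # mean NLL over the chunk
        nlls.append(loss.float() * seqlen)            # approximate total NLL
    return torch.exp(torch.stack(nlls).sum() / (n_chunks * seqlen)).item()

# Usage (checkpoint name is illustrative):
# model = OPTForCausalLM.from_pretrained("facebook/opt-1.3b")
# tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
# print(wikitext2_ppl(simulate_quantize_opt(model), tok))
```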