efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Adding OPT support for simulated quantization.
#5
Closed
cylinbao closed this 9 months ago

cylinbao commented 9 months ago:
- Implement Atom simulated quantization for the OPT family (a minimal sketch follows this list).
- Restructure the codebase under `/model`.
- Rename the main file from `llama.py` to `main.py`.
- Add OPT perplexity results on WikiText2 to the README (see the evaluation sketch at the end).
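
For context, here is a minimal sketch of what simulated (fake) quantization over OPT's decoder linears could look like. The symmetric per-group INT4 scheme, the group size of 128, and the weight-only traversal are illustrative assumptions, not necessarily Atom's exact recipe or this repo's API.

```python
# Minimal fake-quantization sketch for OPT linear layers.
# Assumption: symmetric per-group INT4, weight-only; not Atom's exact scheme.
import torch
from transformers import OPTForCausalLM

def fake_quantize(w: torch.Tensor, n_bits: int = 4, group_size: int = 128) -> torch.Tensor:
    """Quantize then immediately dequantize a weight tensor, group-wise."""
    orig_shape = w.shape
    w = w.reshape(-1, group_size)                           # [num_groups, group_size]
    qmax = 2 ** (n_bits - 1) - 1                            # e.g. 7 for INT4
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return (q * scale).reshape(orig_shape)

@torch.no_grad()
def simulate_quantize_opt(model: OPTForCausalLM, n_bits: int = 4, group_size: int = 128):
    """Overwrite every decoder linear weight with its fake-quantized copy."""
    for layer in model.model.decoder.layers:
        for module in layer.modules():
            if isinstance(module, torch.nn.Linear):
                module.weight.data = fake_quantize(module.weight.data, n_bits, group_size)
    return model
```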
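
Likewise, a hedged sketch of how a WikiText2 perplexity number is commonly computed (strided full-sequence negative log-likelihood). The 2048-token sequence length, the test split, and the checkpoint name in the usage comment are assumptions, not details taken from this issue.

```python
# Sketch of WikiText2 perplexity evaluation over fixed-length chunks.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, OPTForCausalLM

@torch.no_grad()
def wikitext2_ppl(model, tokenizer, seqlen: int = 2048, device: str = "cuda") -> float:
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids
    n_chunks = enc.numel() // seqlen
    nlls = []
    model = model.to(device).eval()
    for i in range(n_chunks):
        batch = enc[:, i * seqlen:(i + 1) * seqlen].to(device)
        loss = model(batch, labels=batch).loss        # mean NLL over the chunk
        nlls.append(loss.float() * seqlen)            # approximate total NLL
    return torch.exp(torch.stack(nlls).sum() / (n_chunks * seqlen)).item()

# Usage (checkpoint name is illustrative):
# model = OPTForCausalLM.from_pretrained("facebook/opt-1.3b")
# tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
# print(wikitext2_ppl(simulate_quantize_opt(model), tok))
```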