efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
feat: add FP4 evaluations
#11
Closed
happierpig
closed
7 months ago
happierpig
commented
7 months ago
This PR introduces the following enhancements:
Integrate support for new data formats in Atom, e.g., FP4 (https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). Use BitsandBytes for quantization (https://github.com/TimDettmers/bitsandbytes).
Polish the code and add more comments. Polish the README.md.
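For readers unfamiliar with FP4, here is a minimal sketch of what quantizing to the 4-bit E2M1 element format from the linked MX spec looks like. It is not the code in this PR and does not call BitsandBytes: the function name `fake_quantize_fp4`, the `group_size` parameter, and the per-group absmax scale are assumptions for illustration (the MX spec itself uses a shared power-of-two scale per block).

```python
import math

# Magnitudes representable by a 4-bit E2M1 float
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
FP4_MAX = FP4_E2M1_GRID[-1]


def fake_quantize_fp4(values, group_size=4):
    """Quantize then dequantize `values`, using one absmax scale per group.

    Hypothetical helper for illustration only; the scaling scheme is a
    simplification and is not taken from Atom or BitsandBytes.
    """
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        absmax = max(abs(v) for v in group)
        # Map the largest magnitude in the group onto the largest FP4 value.
        scale = absmax / FP4_MAX if absmax > 0 else 1.0
        for v in group:
            # Round the scaled magnitude to the nearest representable value.
            magnitude = min(FP4_E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
            out.append(math.copysign(magnitude * scale, v))
    return out
```

Calling `fake_quantize_fp4([6.0, 3.0, -1.5, 0.4])` leaves the first three values unchanged because they already sit on the grid, while `0.4` snaps to `0.5`, the nearest representable magnitude.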