This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
226 stars, 25 forks
fix ln_fc fuse && support 128g for arbitrary shape #49
Closed: Harahan closed this 3 weeks ago