ModelTC / llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
https://arxiv.org/abs/2405.06001
Apache License 2.0

Can llmc support SmoothQuant W8A8 inference on the TRT-LLM backend? #136

Closed GuangyanZhang closed 1 month ago

GuangyanZhang commented 1 month ago

export_trtllm.py sets the quantization type directly with quant_config.quant_algo = QuantAlgo.W4A16. Does this mean activation quantization is not supported yet?
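
For context on the two formats named in this question: W4A16 quantizes only the weights (4-bit) and keeps activations in 16-bit floating point, while W8A8 quantizes both weights and activations to 8-bit integers. Below is a minimal pure-Python sketch of that difference, assuming symmetric per-tensor scaling. It is not code from llmc or TensorRT-LLM; the helper names `quantize_symmetric`, `dot_w4a16`, and `dot_w8a8` are hypothetical and exist only for illustration.

```python
def quantize_symmetric(values, n_bits):
    """Symmetric per-tensor quantization: return (integer codes, scale)."""
    qmax = 2 ** (n_bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dot_w4a16(weights, activations):
    """Weight-only scheme: 4-bit weights are dequantized, activations stay floating point."""
    w_codes, w_scale = quantize_symmetric(weights, 4)
    return sum(c * w_scale * a for c, a in zip(w_codes, activations))


def dot_w8a8(weights, activations):
    """Weights and activations are both 8-bit: integer accumulate, then rescale once."""
    w_codes, w_scale = quantize_symmetric(weights, 8)
    a_codes, a_scale = quantize_symmetric(activations, 8)
    acc = sum(wc * ac for wc, ac in zip(w_codes, a_codes))  # integer arithmetic
    return acc * w_scale * a_scale


# Illustrative values only (assumed, not taken from any model).
weights = [0.5, -1.0, 0.25]
activations = [2.0, 1.0, -4.0]
exact = sum(w * a for w, a in zip(weights, activations))
print(exact, dot_w4a16(weights, activations), dot_w8a8(weights, activations))
```

In the W8A8 path the activations need their own scale, which is why an export script that only ever sets a weight-only algorithm cannot express it.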

gushiqiao commented 1 month ago

@helloyongyang

Harahan commented 1 month ago

We haven't updated export_trtllm.py for a long time.