ModelTC / llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
https://arxiv.org/abs/2405.06001
Apache License 2.0

failed to save quantized model #97

Open LiMa-cas opened 3 hours ago

LiMa-cas commented 3 hours ago
save:
    save_trans: True
    save_lightllm: False
    save_fake: False
    save_path: /extra_data/mali36/llmc/models/

When I use the above config, I get a 16 GB model; when I use the following config, I get a 29 GB model. But the AWQ model is only 5 GB. Could you help me?

    save_trans: False
    save_lightllm: True
    save_fake: False
    save_path: /extra_data/mali36/llmc/models/
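As a rough sanity check (assuming the checkpoint is stored in fp16 and guessing a parameter count of roughly 8B, which is not stated in the thread), checkpoint size scales with bits per weight, which is consistent with a ~16 GB full-precision save versus a ~5 GB 4-bit AWQ file once quantization metadata is added:

```python
def checkpoint_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate serialized checkpoint size in GB (ignores metadata and overhead)."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical 8B-parameter model (illustrative only, not confirmed by the issue):
params = 8e9
print(checkpoint_size_gb(params, 16))  # fp16, full precision -> 16.0 GB
print(checkpoint_size_gb(params, 4))   # 4-bit weights -> 4.0 GB, plus scales/zero-points
```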
gushiqiao commented 2 hours ago

The model size saved by save_trans should match the original model size, since the transformed weights are still stored at the original precision. We are still troubleshooting save_lightllm; for now, you can try save_vllm: True and use the vLLM engine for inference. You can refer to this document: https://llmc-en.readthedocs.io/en/latest/backend/vllm.html
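A minimal save section for that workaround might look like the following. This is a sketch based on the fields shown above plus the save_vllm flag mentioned here; the exact schema may differ by llmc version:

```yaml
save:
    save_trans: False
    save_vllm: True      # export in a format the vLLM engine can load
    save_fake: False
    save_path: /extra_data/mali36/llmc/models/
```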