EstellaXinyuZhang opened 1 month ago
Does "GPU 0: A100-SXM-80GB" support FP8?
Yes. I used AutoFP8 to quantize models such as internlm-chat-7b and facebook/opt-125m and then loaded the quantized models; they work.
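For context on why this works on an A100: native FP8 (E4M3/E5M2) tensor cores only arrived with Ada/Hopper GPUs (compute capability 8.9+), while the A100 is compute capability 8.0. Frameworks such as vLLM can reportedly still run FP8-quantized checkpoints on pre-Hopper GPUs by falling back to weight-only emulation. A minimal sketch of the capability distinction (the table below is an illustrative subset I filled in, not an exhaustive or authoritative mapping):

```python
# Illustrative mapping from GPU name to CUDA compute capability.
# (Hypothetical lookup table for this sketch, not a real API.)
COMPUTE_CAPABILITY = {
    "A100-SXM-80GB": (8, 0),  # Ampere
    "L4": (8, 9),             # Ada
    "H100-SXM": (9, 0),       # Hopper
}

def has_native_fp8(gpu_name: str) -> bool:
    """True if the GPU has hardware FP8 tensor cores (sm_89 or newer)."""
    major, minor = COMPUTE_CAPABILITY[gpu_name]
    return (major, minor) >= (8, 9)

print(has_native_fp8("A100-SXM-80GB"))  # False: A100 is sm_80
print(has_native_fp8("H100-SXM"))       # True
```

So "supports FP8" here means the quantized checkpoint loads and runs, not that the A100 has FP8 hardware.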
Can you post the model to the HF Hub so I can take a look?
Your current environment
🐛 Describe the bug
I used auto_fp8 to quantize the model internlm/internlm2-chat-7b. When I then try to load the FP8 model, I get an error.

The error is
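For reference, the quantize-then-load flow being described can be sketched as below. This is untested here; the class and method names follow the AutoFP8 README (`AutoFP8ForCausalLM`, `BaseQuantizeConfig`) and the vLLM `LLM` entry point, but exact signatures may differ between versions, and the imports are deferred into the functions so the sketch reads without GPU dependencies installed:

```python
def quantize_to_fp8(model_id: str, save_dir: str) -> None:
    """Quantize a HF model to FP8 weights with AutoFP8 and save it."""
    # Lazy imports: auto_fp8 requires a CUDA environment.
    from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

    cfg = BaseQuantizeConfig(
        quant_method="fp8",
        activation_scheme="dynamic",  # dynamic scales: no calibration data needed
    )
    model = AutoFP8ForCausalLM.from_pretrained(model_id, quantize_config=cfg)
    model.quantize([])  # empty calibration set for the dynamic scheme
    model.save_quantized(save_dir)

def load_with_vllm(save_dir: str):
    """Load the FP8-quantized checkpoint with vLLM."""
    from vllm import LLM

    return LLM(model=save_dir, quantization="fp8")
```

Usage would be `quantize_to_fp8("internlm/internlm2-chat-7b", "internlm2-chat-7b-fp8")` followed by `load_with_vllm("internlm2-chat-7b-fp8")`.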