does "GPU 0: A100-SXM-80GB" support FP8 ?
does "GPU 0: A100-SXM-80GB" support FP8 ?
Yes. I used AutoFP8 to quantize models such as internlm-chat-7b and facebook/opt-125m, then loaded the quantized models, and they work.
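For reference, a minimal sketch of that workflow using AutoFP8's documented API. The output directory name and the one-prompt calibration set are illustrative stand-ins; facebook/opt-125m is one of the models mentioned above:

```python
from transformers import AutoTokenizer
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "facebook/opt-125m"
quantized_model_dir = "opt-125m-fp8"  # hypothetical output directory

# A tiny illustrative calibration set; real runs use many more samples.
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
examples = tokenizer(
    ["auto_fp8 is a model quantization library"], return_tensors="pt"
).to("cuda")

# Dynamic activation scales are computed at runtime, so the
# calibration data matters less than with the static scheme.
quantize_config = BaseQuantizeConfig(
    quant_method="fp8", activation_scheme="dynamic"
)

model = AutoFP8ForCausalLM.from_pretrained(
    pretrained_model_dir, quantize_config=quantize_config
)
model.quantize(examples)
model.save_quantized(quantized_model_dir)
```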
Can you post the model to the hf hub so I can take a look?
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!
Your current environment
🐛 Describe the bug
I used auto_fp8 to quantize the model internlm/internlm2-chat-7b.
Then I tried to load the FP8 model, but got an error.
The error is:
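For context, the loading step that fails looks roughly like this sketch; the checkpoint path is a hypothetical stand-in for wherever auto_fp8 saved the quantized model, and internlm models require trust_remote_code:

```python
from vllm import LLM, SamplingParams

# Hypothetical path to the auto_fp8 output directory; vLLM reads the
# fp8 quantization config stored in the checkpoint.
llm = LLM(model="./internlm2-chat-7b-fp8", trust_remote_code=True)

outputs = llm.generate(
    ["Hello, how are you?"], SamplingParams(max_tokens=32)
)
print(outputs[0].outputs[0].text)
```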