Hi! I used export_int8_model.py to smooth, quantize, and export an INT8 model of opt-1.3b. The model size dropped from 2509 MB to 1357 MB, which suggests the quantization succeeded. But when I evaluated the INT8 model, the error below occurred. It seems to be a problem with CUTLASS. How do I solve this?
Looking forward to your reply! Thanks!
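In case it helps with diagnosing the CUTLASS issue, the environment details below can be reproduced with a small snippet like this (standard PyTorch introspection calls only, nothing specific to SmoothQuant or torch-int; I include the compute capability because, as far as I understand, the CUTLASS INT8 GEMM kernels have GPU-architecture requirements):

```python
# Minimal environment report using only public PyTorch APIs.
# The compute capability is the key number if the CUTLASS INT8
# kernels require a particular GPU architecture.
import torch

print("torch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))
```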
Here is my environment:
Here is the detailed error message: