kechan closed this issue 1 month ago
@kechan you're right: if you quantize the activations, calibration is required. You can refer to the MNIST classification example to see how it works: https://github.com/huggingface/optimum-quanto/blob/main/examples/vision/image-classification/mnist/quantize_mnist_model.py
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
I am testing optimum.quanto on this OpenCLIP model:
I then quantized it like this:
I proceeded to test inference on an image (to obtain its vector representation):
The image_features look wildly different. In particular, the output from the quantized model contains a lot of zeros:
I then tested my downstream task, and its accuracy is completely destroyed. I think I may be missing a step here. If you quantize the activations (which I believe PyTorch refers to as static quantization), is calibration mandatory? If you have experience with this, please let me know. I will try it anyway when I get around to it.
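The zeros are consistent with uncalibrated static quantization. A minimal sketch in plain Python (hypothetical numbers, not optimum-quanto's internals) of why the activation scale matters: if the int8 scale assumes a much wider range than the activations actually occupy, every value rounds to zero.

```python
def quantize_dequantize(values, scale):
    """Symmetric int8 quantization followed by dequantization."""
    out = []
    for v in values:
        q = max(-128, min(127, round(v / scale)))  # clamp to int8 range
        out.append(q * scale)
    return out

# Pretend these are activations observed on calibration data.
activations = [0.003, -0.001, 0.0035, 0.0008]

# Calibrated scale: derived from the actual max |activation|.
calibrated_scale = max(abs(v) for v in activations) / 127

# Uncalibrated scale: assumes a default range of [-1, 1], far too wide here.
default_scale = 1.0 / 127

calibrated = quantize_dequantize(activations, calibrated_scale)
uncalibrated = quantize_dequantize(activations, default_scale)

# With the default scale, every activation rounds to 0; with the
# calibrated scale, the values are recovered closely.
```

With the right (calibrated) scale the quantization grid is fine enough to represent the real activation distribution; without it, small activations collapse to zero exactly as observed above.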