Closed: rajat-008 closed this issue 2 weeks ago
As the error message states, you cannot quantize a Linear layer with a single output feature. The code should detect that and fall back to per-tensor quantization, so this is a bug. As a workaround, try increasing the number of output features: that should work.
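The reasoning behind the fallback can be sketched in plain NumPy (a minimal illustration; `per_channel_scales` and `per_tensor_scale` are hypothetical names, not optimum-quanto's API): per-channel weight quantization computes one scale per output row, so with a single output feature it degenerates to exactly the per-tensor scale, which is why falling back instead of raising would be safe.

```python
import numpy as np

def per_channel_scales(weight):
    # One scale per output row (axis 0), as in per-axis weight quantization.
    return np.abs(weight).max(axis=1) / 127.0

def per_tensor_scale(weight):
    # A single scale for the whole weight tensor.
    return np.abs(weight).max() / 127.0

# A Linear layer with one output feature has a (1, in_features) weight.
w_single = np.array([[0.5, -1.0, 0.25]])
# Per-channel scaling degenerates to a single scale, identical to the
# per-tensor scale, so a per-tensor fallback loses no precision here.
assert np.allclose(per_channel_scales(w_single), [per_tensor_scale(w_single)])

# With several output features, per-channel gives one scale per row.
w_multi = np.array([[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]])
print(per_channel_scales(w_multi))  # one scale per output row
```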
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
I'm having the same issue when loading the OWLv2 model in 8-bit. When is this going to be fixed? @dacorvo This is a zero-shot model, so I cannot change much.
@merveenoyan In addition to fixing this, I added an object-detection example based on OWLv2: https://github.com/huggingface/optimum-quanto/blob/main/examples/vision/object-detection/quantize_owl_model.py You need to install the package from the main branch to use it.
Above is the model I have defined. When I try to quantize it and call `freeze`, I get the error below: