@KhaoKhao thank you for your feedback. As a first step, can you verify your script works on the non-quantized model? If so, can you provide that script so that the issue can be reproduced?
Sure. I've tested the script with the non-quantized model, and it works perfectly fine. Here is the script I used (I implemented it based on quantize_mnist_model.py in the repo):
```python
import os
import time

import torch
from optimum.quanto import QTensor


def test(model, test_loader, device, criterion):
    model.to(device)
    model.eval()
    test_loss = 0.0
    correct = 0
    with torch.no_grad():
        start = time.time()
        for image, label in test_loader:
            image, label = image.to(device), label.to(device)
            output = model(image)
            # Dequantize the output if the model returned a quantized tensor
            if isinstance(output, QTensor):
                output = output.dequantize()
            loss = criterion(output, label)
            test_loss += loss.item()
            pred = torch.argmax(output, 1)
            correct += torch.sum(pred == label.data)
        avg_loss = test_loss / len(test_loader)
        accuracy = 100 * correct / len(test_loader.dataset)
        end = time.time()
    # Save the state dict temporarily to measure the model size on disk
    torch.save(model.state_dict(), 'tmp.pt')
    print(output.shape)
    print(f' -Evaluated in {end - start} s.\n -Average loss : {avg_loss} \n -Accuracy : {accuracy} \n -Size : {"%.2f MB" % (os.path.getsize("tmp.pt") / 1e6)}')
    os.remove('tmp.pt')
```
This part works perfectly fine:
```python
###------- Loading model -------###
model = models.resnet18(pretrained=False)
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model = model.to(device)

# Load weight
model.load_state_dict(torch.load('PATH_TO_MODEL_WEIGHT'))

###------- Float model -------###
print('Float model :')
test(model, test_loader, device, criterion)
```
But this part gave the error mentioned above:
```python
###------- Quantization -------###
###------- int8 -------###
print('Quantize int8 model :')
quantize(model, weight=qint8, activations=None)
print(' *Calibrating...')
print(model)
with Calibration():
    test(model, test_loader, device, criterion)
```
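For reference, the quanto examples I based this on freeze the model after calibration before evaluating the quantized weights. A minimal sketch of that flow, assuming the `freeze` helper exported by `optimum.quanto`:

```python
# Sketch of the usual quanto flow after calibration (assumes optimum.quanto exports freeze)
from optimum.quanto import freeze

with Calibration():
    test(model, test_loader, device, criterion)  # collect activation ranges
freeze(model)  # replace float weights with their integer representation
test(model, test_loader, device, criterion)  # evaluate the frozen int8 model
```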
I've tried other models as well, and it seems the problem originates from the conv2d layer. Maybe the weight argument is missing after quantization? PS: I've tried using your model (dacorvo/mnist-mlp), and it works perfectly fine.
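As a quick (hypothetical) sanity check, this is how the conv and linear layers could be inspected after quantization to see whether their weights are currently QTensors; it only assumes `QTensor` is importable from `optimum.quanto`, as in the `test()` helper above:

```python
# Hypothetical inspection snippet: list Conv2d/Linear modules and report
# whether each module's weight is currently a quantized tensor.
import torch.nn as nn
from optimum.quanto import QTensor

for name, module in model.named_modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        weight = getattr(module, "weight", None)
        print(name, type(module).__name__, isinstance(weight, QTensor))
```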
Thanks for your help!
@KhaoKhao I don't know which model you used for your tests, but this works on my setup, either with optimum-quanto `0.2.3`, `0.2.4`, or on the `main` branch:
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from datasets import load_dataset

from optimum.quanto import Calibration, quantize, qint8

dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

image_processor = AutoImageProcessor.from_pretrained("microsoft/resnet-18")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-18")

quantize(model, weights=qint8, activations=qint8)

inputs = image_processor(image, return_tensors="pt")
with Calibration():
    logits = model(**inputs).logits
print(logits)
```
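A quick follow-up, not part of the original snippet: the logits produced under calibration can be mapped to a class name with the standard transformers `id2label` mapping, which makes it easy to check the output is still sensible.

```python
# Optional check: map the calibrated logits to a human-readable class name
# using the id2label mapping stored in the model config.
predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```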
Hi, I'm encountering an issue when trying to quantize the model. The quantization process completes successfully, but when I attempt to calibrate or make a prediction, I receive the following error:
I investigated the error and found that it seems to originate from line 10.
And here is the model after quantization:
I would greatly appreciate any assistance you could provide with this problem. I have been trying to find a solution but have been unsuccessful so far. Thank you in advance for your help.