google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Resnet50 returning wrong label prediction with .tflite and .tflite quantized #88

Closed RubensZimbres closed 1 month ago

RubensZimbres commented 1 month ago

I have the following code, where I load ResNet50, convert it with ai-edge-torch, and then quantize the model to run inference.

However, unlike the tutorial with EfficientNet, which uses (224, 224, 3) inputs, the data format accepted here is (batch, channels, height, width), e.g. (16, 3, 224, 224).

The code runs successfully, but the predicted labels are wrong. I am wondering if this is just a reshape issue or something more serious, such as the model conversion to TFLite. Both the .tflite model and the quantized .tflite model return the same wrong label.
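For reference, the difference between the two layouts can be sketched with NumPy. This is an illustrative example on a dummy array, not part of the original script; the variable names are assumptions made for the illustration.

```python
import numpy as np

# Dummy image in channels-last layout (height, width, channels),
# matching the (224, 224, 3) shape mentioned for the tutorial.
img_hwc = np.zeros((224, 224, 3), dtype=np.float32)

# Move channels first -> (3, 224, 224), then add a batch axis -> (1, 3, 224, 224).
img_nchw = np.transpose(img_hwc, (2, 0, 1))[np.newaxis, ...]

# Convert back to channels-last while keeping the batch axis -> (1, 224, 224, 3).
img_nhwc = np.transpose(img_nchw, (0, 2, 3, 1))

print(img_nchw.shape)  # (1, 3, 224, 224)
print(img_nhwc.shape)  # (1, 224, 224, 3)
```

Note that `np.reshape` is not a substitute here: it keeps the underlying element order and would mix pixels across channels, whereas `np.transpose` actually moves the axes.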

I'm running the code in an Anaconda environment, Ubuntu 22.04, Python 3.10.

imagenet_class_index.json was obtained from Kaggle.

Library versions:

ai-edge-torch 0.1.1
torch 2.4.0.dev20240429+cpu
torch-xla 2.4.0+git174f407
torchaudio 2.2.0.dev20240429+cpu
torchvision 0.19.0.dev20240429+cpu
tensorflow-io-gcs-filesystem 0.37.1

THE CODE:

```python
import torch
import torchvision
import ai_edge_torch
from PIL import Image
import torchvision.transforms as transforms
import tensorflow as tf
import numpy as np
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch._export import capture_pre_autograd_graph
from ai_edge_torch.quantize.pt2e_quantizer import get_symmetric_quantization_config
from ai_edge_torch.quantize.pt2e_quantizer import PT2EQuantizer
from ai_edge_torch.quantize.quant_config import QuantConfig
import json

# Initialize model
resnet50 = torchvision.models.resnet50().eval()

# Convert to tflite
sample_input = (torch.randint(0, 256, (1, 3, 224, 224), dtype=torch.float32),)

edge_model = ai_edge_torch.convert(resnet50, sample_input)

edge_model.export("/home/ai-edge/resnet50.tflite")

imported_edge_model = ai_edge_torch.load("/home/ai-edge/resnet50.tflite")

# QUANTIZE TFLITE MODEL
pt2e_quantizer = PT2EQuantizer().set_global(
    get_symmetric_quantization_config(is_per_channel=True, is_dynamic=True)
)

pt2e_torch_model = capture_pre_autograd_graph(resnet50, sample_input)
pt2e_torch_model = prepare_pt2e(pt2e_torch_model, pt2e_quantizer)

# Run the prepared model with sample input data to ensure that internal
# observers are populated with correct values
pt2e_torch_model(sample_input)

# Convert the prepared model to a quantized model
pt2e_torch_model = convert_pt2e(pt2e_torch_model, fold_quantize=False)

# Convert to an ai_edge_torch model
pt2e_drq_model = ai_edge_torch.convert(
    pt2e_torch_model,
    sample_input,
    quant_config=QuantConfig(pt2e_quantizer=pt2e_quantizer),
)

edge_model.export("/home/ai-edge/resnet50_quantized.tflite")

quantizeded_edge_model = ai_edge_torch.load("/home/ai-edge/resnet50_quantized.tflite")

########################################## INFERENCE ############################################

# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path="/home/ai-edge/resnet50_quantized.tflite")
interpreter.allocate_tensors()

# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Get input shape
input_shape = input_details[0]['shape']

# Load and preprocess the image
image = Image.open('/home/ai-edge/car.jpeg')

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),  # Converts to float32 and scales to [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

img_tensor = preprocess(image)  # img_tensor is now a FloatTensor with shape (3, 224, 224)
img_tensor = img_tensor.unsqueeze(0)  # Shape: (1, 3, 224, 224)

# Convert img_tensor to numpy array and ensure it matches the expected dtype (uint8)
img_numpy = img_tensor.numpy()
img_numpy = (img_numpy * 255).astype(np.uint8)

# Set the tensor to the interpreter
interpreter.set_tensor(input_details[0]['index'], img_numpy)

# Run inference
interpreter.invoke()

# Get the output tensor
output_data = interpreter.get_tensor(output_details[0]['index'])

# Print the output
print("Output:", output_data)

# Apply softmax to the output data
probabilities = tf.nn.softmax(output_data[0])

# Find the index of the highest probability
predicted_class = np.argmax(probabilities)

# Load class labels
with open('/home/ai-edge/imagenet_class_index.json') as f:
    class_idx = json.load(f)

# Now class_idx is a dictionary of class names
print(class_idx[str(predicted_class)][1])
```
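A side note on the input conversion step: the script casts the normalized tensor to uint8 before calling `set_tensor`. Whether that is right depends on what the converted model reports in `input_details[0]['dtype']`; a model converted from float inputs may well expect float32 instead. The sketch below shows one way to guard against a mismatch. `check_input` is a hypothetical helper, and `fake_detail` is a stand-in dict that only mimics the `'shape'` and `'dtype'` keys returned by `interpreter.get_input_details()`; the real values have to come from the interpreter.

```python
import numpy as np


def check_input(array, input_detail):
    """Raise ValueError if `array` does not match the reported shape/dtype.

    `input_detail` is one entry of `interpreter.get_input_details()`,
    i.e. a dict with at least the keys 'shape' and 'dtype'.
    """
    expected_shape = tuple(int(d) for d in input_detail["shape"])
    expected_dtype = np.dtype(input_detail["dtype"])
    if array.shape != expected_shape:
        raise ValueError(f"shape {array.shape} != expected {expected_shape}")
    if array.dtype != expected_dtype:
        raise ValueError(f"dtype {array.dtype} != expected {expected_dtype}")
    return array


# Stand-in for input_details[0] (assumed values, for illustration only).
fake_detail = {"shape": np.array([1, 3, 224, 224]), "dtype": np.float32}

ok = check_input(np.zeros((1, 3, 224, 224), dtype=np.float32), fake_detail)
print(ok.shape, ok.dtype)  # (1, 3, 224, 224) float32
```

In the script, this would be called as `check_input(img_numpy, input_details[0])` right before `interpreter.set_tensor(...)`.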

RubensZimbres commented 1 month ago

I solved the problem: I was loading the torchvision model the wrong way, without the pretrained weights. It should be:

resnet50 = torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1).eval()
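One way to confirm a fix like this is to look at the top few predictions and their probabilities instead of a single argmax. With untrained weights, the probabilities would be expected to be spread thinly across the classes rather than concentrated on a few. The following is a minimal sketch in plain Python. `top_k` is a hypothetical helper, and the toy data only imitates the `{"index": [id, label]}` structure that the script reads from imagenet_class_index.json; the IDs and logits are placeholders.

```python
import math


def top_k(logits, class_idx, k=5):
    """Return the k most probable (label, probability) pairs."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices sorted by descending probability, truncated to k entries.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(class_idx[str(i)][1], probs[i]) for i in order]


# Toy data in the same layout as the class index file (placeholder values).
toy_classes = {"0": ["id0", "cat"], "1": ["id1", "car"], "2": ["id2", "dog"]}
toy_logits = [0.1, 3.0, 0.2]

for label, p in top_k(toy_logits, toy_classes, k=2):
    print(f"{label}: {p:.3f}")
```

With the real model, `logits` would be `output_data[0].tolist()` and `class_idx` the dictionary loaded from the JSON file.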