Closed RubensZimbres closed 3 months ago
Hi @RubensZimbres, seems like there are a couple of things going on here ...
> My first question is: when Tensorflow converts the existing resnet18 model, does it automatically reshape the input format to tflite version?
Short answer: yes, here's my script:
```python
import ai_edge_torch
import torch
from torch import nn
import torchvision

resnet18 = torchvision.models.resnet18(pretrained=True)

class PermuteInput(nn.Module):
    def __init__(self):
        super(PermuteInput, self).__init__()

    def forward(self, x):
        # Permute from (batch, height, width, channels) to (batch, channels, height, width)
        return x.permute(0, 3, 1, 2).float()

resnet18_with_reshape = nn.Sequential(
    PermuteInput(),
    resnet18,
)

# NHWC uint8 sample input; the wrapper permutes it to NCHW for the model.
sample_input = (torch.randint(0, 256, (1, 224, 224, 3), dtype=torch.uint8),)
resnet18_permuted = ai_edge_torch.convert(resnet18_with_reshape.eval(), sample_input)
resnet18_permuted.export("resnet18_permuted.tflite")

# NCHW float32 sample input; the original model converted as-is for comparison.
sample_input_2 = (torch.randint(0, 256, (1, 3, 224, 224), dtype=torch.float32),)
edge_resnet18 = ai_edge_torch.convert(resnet18.eval(), sample_input_2)
edge_resnet18.export("resnet18.tflite")
```
Without your manual adjustment, the converted graph looks like: input = (1, 3, 224, 224) → <inserted transpose> → (1, 224, 224, 3)

With your manual adjustment: input = (1, 224, 224, 3) → <manual transpose> → (1, 3, 224, 224)
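To make the layout change concrete, here is a small pure-Python sketch of what the axis order `(0, 3, 1, 2)` does to a shape (`permute_shape` is a hypothetical helper for illustration, not part of ai-edge-torch or PyTorch):

```python
def permute_shape(shape, dims):
    """Reorder the axes of a shape tuple, mirroring torch.Tensor.permute."""
    return tuple(shape[d] for d in dims)

# NHWC input, as MediaPipe expects it ...
nhwc = (1, 224, 224, 3)

# ... permuted to NCHW, as the PyTorch ResNet expects it.
nchw = permute_shape(nhwc, (0, 3, 1, 2))
print(nchw)  # (1, 3, 224, 224)
```

The same index order is what `PermuteInput.forward` passes to `x.permute(...)` in the script above.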
You can verify this with our Model Explorer tool: https://github.com/google-ai-edge/model-explorer (hosted demo: https://huggingface.co/spaces/1aurent/model-explorer).
You are correct, this looks like an integration issue. I think MediaPipe reads the "GraphInputs" nodes and sees that the original model follows the PyTorch convention, which causes the failure. I tried loading your modified model in MediaPipe Studio and ran into the issue you describe, but I believe that is a MediaPipe issue, since AET does the conversion properly as far as I can tell. Please create a MediaPipe issue for that one: https://github.com/google-ai-edge/mediapipe. For the original model, I think AET is also converting properly, as we don't want to change someone's specified input shape (seen in sample_input_2). MediaPipe is allowed to specify a particular input shape as well, so I don't think that's a problem either. So I would say you actually "did the right thing" in using a custom model to satisfy your needs.
Thanks, @pkgoogle, my idea was in fact to insert a customized model (not officially supported) into MediaPipe. As of my last update, the only remaining problem was adding metadata. I followed https://www.tensorflow.org/lite/models/convert/metadata and added the following, given that my model has 1000 classes, not the 1001 listed in https://github.com/tensorflow/tflite-support/raw/master/tensorflow_lite_support/metadata/python/tests/testdata/image_classifier/labels.txt (saved locally as mobilenet_labels.txt):
```python
input_stats.width = [224]
input_stats.height = [224]
input_stats.num_classes = [1000]
```
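The 1000-vs-1001 mismatch comes from the MobileNet label file carrying an extra leading "background" entry, while ResNet-18 predicts only the 1000 ImageNet classes. A minimal pure-Python sketch of trimming the list before attaching it as metadata (the label names here are simulated, not read from the actual file):

```python
# Simulate the downloaded MobileNet label file: a leading "background"
# entry followed by 1000 real class names (1001 lines in total).
mobilenet_labels = ["background"] + [f"class_{i}" for i in range(1000)]
assert len(mobilenet_labels) == 1001

# ResNet-18 outputs 1000 logits, so drop the extra background entry
# before writing the labels into the TFLite metadata.
resnet_labels = mobilenet_labels[1:]
print(len(resnet_labels))  # 1000
```

In practice one would write `resnet_labels` out to resnet_labels.txt, one label per line, and pass that file to the metadata writer.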
The only issue still open is that `from tflite_support import metadata_schema_py_generated as _metadata_fb` (from the metadata writer) only runs successfully on Tensorflow 2.13.0. But maybe that does not concern ai-edge-torch directly. Please feel free to close this issue.
Thanks!
Hi @RubensZimbres, you may find better support for that particular issue here: https://github.com/tensorflow/tflite-support. As requested, closing. Thanks for your help!
The ImageClassifier model compatibility requirements () state that the exported .tflite model must take an image input of size [batch x height x width x channels].
However, the code in this repo produces models that expect an image input of size [batch x channels x height x width].
I'm using Python 3.10 in Anaconda plus:
My first question is: when Tensorflow converts the existing resnet18 model, does it automatically reshape the input format to tflite version?
Because I am adding a customized .tflite (converted with ai-edge-torch) to a MediaPipe Image Classifier, and it does not work. I exported the TFLite model with and without metadata, quantized or not. None of them work, maybe because of the input shape.
This means that ai-edge-torch successfully converts PyTorch models to .tflite, and MediaPipe uses .tflite for its image classification inference. However, a model converted with ai-edge-torch keeps the PyTorch (1, Channels, Height, Width) input layout (https://github.com/google-ai-edge/ai-edge-torch), while MediaPipe ImageClassifier only works with (1, Height, Width, Channels) (https://www.tensorflow.org/lite/inference_with_metadata/task_library/image_classifier). So my idea was to use a customized classifier in MediaPipe, but it looks like the two products don't talk to each other.
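As a quick sanity check before handing a model to MediaPipe, one can look at whether a 4-D input shape puts the channel axis last (NHWC, the MediaPipe convention) or second (NCHW, the PyTorch convention). A hypothetical helper, not part of either library, assuming image inputs with 1 or 3 channels:

```python
def guess_layout(shape):
    """Guess the layout of a 4-D image input shape.

    Assumes a small channel count (1 or 3). Returns "NHWC", "NCHW",
    or "unknown" when neither axis looks like a channel axis.
    """
    if len(shape) != 4:
        return "unknown"
    if shape[3] in (1, 3):
        return "NHWC"
    if shape[1] in (1, 3):
        return "NCHW"
    return "unknown"

print(guess_layout((1, 224, 224, 3)))  # NHWC: what MediaPipe expects
print(guess_layout((1, 3, 224, 224)))  # NCHW: what the PyTorch model uses
```

A converted model whose input comes back as NCHW is the signal that a permute wrapper (as in the script above) is needed before MediaPipe can consume it.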
I tried:
It solves the input incompatibility, but the .tflite model still does not work in MediaPipe, even with a signature and metadata. For `uint8` input, the web page shows:

Error: INVALID_ARGUMENT: Classification tflite models are assumed to have a single subgraph.; Initialize was not ok; StartGraph failed
I also tried with float32 and it didn't work. I get the following error in MediaPipe interface:
Error: INVALID_ARGUMENT: Classification tflite models are assumed to have a single subgraph.; Initialize was not ok; StartGraph failed
UPDATE:
I was able to make it work without errors with this code:
I uploaded the model to https://netron.app/ and inspected the graph. However, there is no classification, since I am unable to add metadata (resnet_labels.txt) to the model; when I do, MediaPipe does not accept the .tflite file. Another issue is that ai-edge-torch runs on Tensorflow 2.17.0 while the metadata writer only runs on Tensorflow 2.13.0, so two environments are necessary, which is counterproductive.