tensorflow / models

Models and examples built with TensorFlow

Using TF2.4 TFLite Model in Android App - Error #9205

Open mihir-chauhan opened 4 years ago

mihir-chauhan commented 4 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md

2. Describe the bug

I am able to convert the SavedModel (.pb) to a TensorFlow Lite (.tflite) file using the link above, but when I try to run it on an Android phone, it gives me the error `java.lang.IllegalArgumentException: ByteBuffer is not a valid flatbuffer model`. Even using both a custom model and the SSD MobileNet v2 320x320, I get the same error. @srjoglekar246 said that the issue is with wrong constants used for num_classes, etc., but I am unsure how to fix this.
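[Editor's note] The "ByteBuffer is not a valid flatbuffer model" error usually means the bytes handed to the interpreter are not a TFLite flatbuffer at all (wrong file path, a truncated asset, or Android asset compression mangling the file). As a quick sanity check outside the app, a valid `.tflite` file carries the flatbuffer file identifier `TFL3` at byte offset 4. A minimal sketch (the helper name is mine):

```python
def looks_like_tflite(path):
    """Return True if the file starts with the TFLite flatbuffer identifier."""
    with open(path, "rb") as f:
        header = f.read(8)
    # Flatbuffer files store a 4-byte file identifier at byte offset 4;
    # for TFLite models it is the ASCII string "TFL3".
    return len(header) == 8 and header[4:8] == b"TFL3"
```

If the file itself passes this check, a common cause on Android is the build compressing the asset; adding `aaptOptions { noCompress "tflite" }` to the app's Gradle config is the usual fix.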

3. Steps to reproduce

I am using one of the later versions of tf-nightly (2.4.0-dev20200904) and am using the scripts found here which are made by @srjoglekar246 and convert the saved model to tflite.

4. Expected behavior

It should work without error. However, it is giving the `java.lang.IllegalArgumentException: ByteBuffer is not a valid flatbuffer model` error. When I run the same setup with a TF 1.14 model, it works perfectly fine.

5. Additional context

@srjoglekar246 has made changes and I have followed everything based on issue #9033

6. System information

hahmad2008 commented 4 years ago

@srjoglekar246 after converting using the latest script you mentioned, I got this error

srjoglekar246 commented 4 years ago

Please use the latest nightly version of tflite_convert to use the latest changes :-)


mihir-chauhan commented 4 years ago

Hi @srjoglekar246, will updating the tf-nightly version fix the issue I am running into, or is this for fixing @hahmad2008's issue? If it is regarding my issue, do I only need to rerun tflite_convert, or do I have to recreate the export tflite graph? Thanks!

srjoglekar246 commented 4 years ago

@mihir-chauhan This might be a bug with our detection app, that needs to be updated for TF2 models. Lemme take a look at this.

srjoglekar246 commented 4 years ago

@mihir-chauhan You will need to modify the app code based on the model you are using. See these parameters.

For example, TF_OD_API_IS_QUANTIZED should be false since you are not using a quantized model, unless you used post-training quantization during conversion. Similarly, is your model named the same as what the app expects (detect.tflite), and in the assets folder?

Plus, I think your model only has two classes? The app is meant for the COCO detection task which detects around 80 classes, so you will probably have to modify the labels file. It is difficult to predict exactly which variables may be causing the problems, so you will probably have to dig around the code a bit and debug, since the app needs to be tailored for the model & the use-case :-). But hunting around the files I have linked here will probably give you some starting points on where to look.
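[Editor's note] For reference, the TFLite detection postprocess op emits four output tensors — boxes `[1, N, 4]`, class indices `[1, N]`, scores `[1, N]`, and the number of valid detections `[1]` — and the app's constants (labels file, class count, quantization flag) must agree with them. A hedged sketch of decoding those outputs, using mock data in place of a real interpreter run (the helper name and two-class labels are hypothetical):

```python
def decode_detections(boxes, classes, scores, count, labels, threshold=0.5):
    """Turn raw detection-postprocess outputs into (label, score, box) tuples."""
    results = []
    for i in range(int(count[0])):
        if scores[0][i] >= threshold:
            results.append((labels[int(classes[0][i])], scores[0][i], boxes[0][i]))
    return results

# Mock outputs shaped like TFLite_Detection_PostProcess / StatefulPartitionedCall:
boxes = [[[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.9, 0.9]]]  # [1, N, 4], normalized
classes = [[0.0, 1.0]]                                   # [1, N], float class indices
scores = [[0.9, 0.3]]                                    # [1, N]
count = [2.0]                                            # [1]
labels = ["stone", "skystone"]                           # hypothetical 2-class labels

print(decode_detections(boxes, classes, scores, count, labels))
# → [('stone', 0.9, [0.1, 0.1, 0.5, 0.5])]
```

If the app hard-codes a different class count or labels file than the model was trained with, the class-index lookup above is where things go wrong.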

mihir-chauhan commented 4 years ago

Hi @srjoglekar246,

I have played around with the parameters and creating the tflite model using different parameters like image_shapes, etc. but it looks like it still is giving me the same error. Could you please try making a .tflite model based on the SSD MobileNet v2 320x320? It would be great if you can upload it here and then I can try using that to see if I am missing something in the app or the issue is in making and converting to tflite for my side. Thanks!

srjoglekar246 commented 4 years ago

detect.tflite.zip Note that it has 300x300 input dimensions for some reason :-/.

mihir-chauhan commented 4 years ago

Hi @srjoglekar246,

I have tried the model you sent and have tried testing with both setting quantized variable to true and false and it seems that either way, I am getting the same error.

This is the error I am running into: `java.lang.NullPointerException: Internal error: Cannot allocate memory for the interpreter: tensorflow/contrib/lite/kernels/reshape.cc:56 stretch_dim != -1 (12 != -1)`. I am using this repo: https://github.com/FIRST-Tech-Challenge/SkyStone, and this is the specific file that I am using to run it. The assets folder is here. It would be great if you could try it out with this setup, or if there is a simple fix in the code for this issue. Thanks a lot!
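[Editor's note] The `reshape.cc ... stretch_dim != -1` failure comes from TFLite's Reshape kernel, which allows at most one wildcard (`-1`) dimension in a target shape; the `tensorflow/contrib/lite` path in the trace also suggests the SkyStone app bundles a very old, contrib-era TFLite runtime that may simply misparse a TF2-converted model's shape data. A rough Python mirror of the kernel's check, to illustrate why it fires (this is an illustration of the rule, not the actual kernel code):

```python
def resolve_reshape(num_elements, target_shape):
    """Mimic the TFLite Reshape rule: at most one -1 ('stretch') dim allowed."""
    stretch_dim = -1
    known = 1
    for i, d in enumerate(target_shape):
        if d == -1:
            if stretch_dim != -1:
                # This is the condition behind "stretch_dim != -1 (12 != -1)":
                # a second wildcard was seen after one was already recorded.
                raise ValueError("more than one wildcard dimension in reshape")
            stretch_dim = i
        else:
            known *= d
    shape = list(target_shape)
    if stretch_dim != -1:
        shape[stretch_dim] = num_elements // known
    return shape
```

For example, `resolve_reshape(12, [3, -1])` resolves to `[3, 4]`, while a shape with two `-1`s is rejected, which is what the interpreter reports here.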

srjoglekar246 commented 4 years ago

Aah... I am not sure. You will need to inspect both the Skystone model & the SSD model with something like Netron. The inputs that the app code expects might not be the same.

You can look at the tensorflow example app to see how to use TFLite in Java.

mihir-chauhan commented 4 years ago

I put the model in Netron and the main visual differences between the model made from 1.14 and 2.4 are: the name is serving_default_input:0 in the 2.4 model instead of normalized_input_image_tensor, found in the 1.14 model. Another difference is the output where it says TFLite_Detection_PostProcess: for the 1.14 model, but on the 2.4 model, it uses StatefulPartitionedCall:.

It also has more layers towards the end compared to the 1.15 (I think that is just the change in structure since the 2.4 is not quantized).

Here is an image of the difference I am referring to in the ending layers:

TF 2.4 model on Netron: [screenshot]
TF 1.14 model on Netron: [screenshot]

Do I have to use TOCO to convert or would MLIR work the same?

I have also tried using the TF Examples app for android and it seems that it is always overwriting the model I put into the assets folder. Not sure what the problem is there.

srjoglekar246 commented 4 years ago

The changed tensor names are expected. The model still works in terms of what the tensors mean. TOCO cannot convert TF2 detection models.

Not sure why the TF Examples app is over-writing the model. Does it 'create' a detect.tflite file?

mihir-chauhan commented 4 years ago

Yes, it is overwriting the detect.tflite model that I put into the folder on each Gradle run. I think the Gradle download script may be causing that issue, however.

mihir-chauhan commented 4 years ago

@srjoglekar246, Is there a way to fix this issue in testing the model?

srjoglekar246 commented 4 years ago

I am not that familiar with Android development, unfortunately :-(

mihir-chauhan commented 4 years ago

That's alright :). Is there someone that you can assign to this issue to help me out? Thanks!

davita8 commented 4 years ago

@mihir-chauhan There is a download_model.gradle script that overwrites the model. You can rename the model constant from `private static final String TF_OD_API_MODEL_FILE = "detect.tflite"` to `"abc.tflite"` so the download no longer clobbers your file.

mihir-chauhan commented 4 years ago

Thanks @davita8, I will try doing that.

hxhexi commented 4 years ago

@mihir-chauhan what's the file size of the converted tflite?

mihir-chauhan commented 4 years ago

I think it's around 7000 KB

hxhexi commented 4 years ago

> I think it's around 7000 KB

Good for you! My model is ssd_mobilenet_v2_fpn_640x640 and the converted tflite model is 516 bytes, sadly! I still don't know why. We used the same link: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md
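[Editor's note] A 516-byte `.tflite` is far too small to hold any weights — roughly the size of an empty flatbuffer — so the conversion almost certainly failed silently. A trivial sanity check after converting (the function name and the 100 KB threshold are arbitrary assumptions; real detection models run to several MB):

```python
import os

def conversion_looks_sane(path, min_bytes=100_000):
    """A detection model should carry megabytes of weights; a few hundred
    bytes means the converter emitted an essentially empty model."""
    return os.path.getsize(path) >= min_bytes
```

Running this right after `converter.convert()` output is written would catch the 516-byte case before the model ever reaches the app.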

srjoglekar246 commented 4 years ago

@hxhexi Please use the latest TF nightly for conversion (tflite_convert or Python API)

davita8 commented 4 years ago

@srjoglekar246
1. For Android segmentation there is only DeepLab v3, but the custom trainer only works on TF 1.15. How can it work with the latest tf-nightly?
2. Why does conversion of ssd_mobilenet_v2_fpn_640x640 not work on version 2.3?