f0cal / google-coral

Community gathering point for Google Coral dev board and dongle knowledge.

How to deploy my own .tflite file #20

Open caiya55 opened 5 years ago

caiya55 commented 5 years ago

Now I have my own compiled .tflite model; the file has already passed the Compile your model for the Edge TPU step, and visualize.py shows that the operators have all become UINT8 type. Now I'd like to know how to deploy the model and make it run on the TPU. The Python API on the official website provides two engines, edgetpu.classification.engine and edgetpu.detection.engine, but my model is the open-pose (human pose estimation) model, so the output is different.

Is anyone working on the deployment of your own model? I would appreciate it if someone could give me some clues.

caiya55 commented 5 years ago

I found a solution. It is not working very well yet, but I will keep updating it in this issue.

If you want to deploy your own tflite model on the Coral, you basically need to follow these steps:

  1. Convert your .pb file to a .tflite file, and make sure your tflite passes the Edge TPU Model Compiler. Download the resulting .tflite file; this is your tflite-edgetpu file.
  2. Upload your model onto the device with the mdt push command (scp also works).
  3. In your Python script, use the official Python API to load the model. My code looks like this:
    
    from edgetpu.basic.basic_engine import BasicEngine
    import numpy as np
    from PIL import Image

    image_path = "./images/p1.jpg"
    model_path = 'open_pose_tflite'
    target_size = (432, 368)
    output_size = [54, 46, 57]

    # load the image
    image = Image.open(image_path)
    image = image.resize(target_size, Image.ANTIALIAS)
    image = np.array(image).flatten()

    # load the model
    engine = BasicEngine(model_path)
    result = engine.RunInference(input_tensor=image)
    process_time = result[0]
    my_model_output = result[1].reshape(output_size)

caiya55 commented 5 years ago

Basically, this works if you convert your model correctly. But I get weird output now. One thing that really bothers me is that the output type in my .tflite file is UINT8, while it is FLOAT32 in the .pb file.

I am still working on it. According to the official tflite models, their output is FLOAT32, so I am checking the conversion procedure; maybe there are some tricks.

You can also use tensorflow/lite/tools/visualize.py to check the tensors of your tflite model. It is very convenient.
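If you prefer to stay in Python, the same check can be done with the plain tf.lite.Interpreter; a minimal sketch (the model path is a placeholder):

    import tensorflow as tf

    # Inspect the input/output tensor types of the (pre-edgetpu-compiler) .tflite file.
    interpreter = tf.lite.Interpreter(model_path='open_pose.tflite')
    for detail in interpreter.get_input_details():
        print('input: ', detail['name'], detail['dtype'], detail['shape'])
    for detail in interpreter.get_output_details():
        print('output:', detail['name'], detail['dtype'], detail['shape'])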

caiya55 commented 5 years ago

I found a good piece of software called Netron. It is a very powerful tool to visualize your model and find the names of the inputs and outputs. It is more efficient than visualize.py, so I recommend Netron now.
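If it helps, Netron can also be launched from Python once it is installed with pip (a small sketch; the model path is a placeholder):

    import netron

    # Opens a local web UI showing the graph, tensor names, and quantization parameters.
    netron.start('open_pose_edgetpu.tflite')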

Charlesl0129 commented 5 years ago

Hi caiya55, have you solved the problem yet? I'm in a similar situation, trying to get pose estimation running, but model.RunInference gives a segmentation fault...

caiya55 commented 5 years ago

Yes, the UINT8 output problem is solved. It seems the official API converts your UINT8 output into float32, so in the end you get float32 output. Here is the source code from the Coral support team:

const auto& output_indices = interpreter->outputs();
  const int num_outputs = output_indices.size();
  int out_idx = 0;
  for (int i = 0; i < num_outputs; ++i) {
    const auto* out_tensor = interpreter->tensor(output_indices[i]);
    CHECK(out_tensor);
    if (out_tensor->type == kTfLiteUInt8) {
      const int num_values = out_tensor->bytes;
      const uint8_t* output = interpreter->typed_output_tensor<uint8_t>(i);
      CHECK(output);
      for (int j = 0; j < num_values; ++j) {
        output_data[out_idx++] = (output[j] - out_tensor->params.zero_point) *
                                 out_tensor->params.scale;
      }
    } else if (out_tensor->type == kTfLiteFloat32) {
      const int num_values = out_tensor->bytes / sizeof(float);
      const float* output = interpreter->typed_output_tensor<float>(i);
      CHECK(output);
      for (int j = 0; j < num_values; ++j) {
        output_data[out_idx++] = output[j];
      }
    } else {
      LOG(FATAL) << "Tensor " << out_tensor->name
                 << " has unsupported output type: " << out_tensor->type;
    }
    CHECK_LE(out_idx, output_size);
  }
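For reference, the equivalent dequantization can also be done on the Python side with the plain tf.lite.Interpreter (a minimal sketch, not the BasicEngine code itself; the model path is a placeholder and the invoke step is elided):

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='open_pose.tflite')
    interpreter.allocate_tensors()
    # ... set the input tensor and call interpreter.invoke() here ...

    out_detail = interpreter.get_output_details()[0]
    raw = interpreter.get_tensor(out_detail['index'])   # UINT8 values
    scale, zero_point = out_detail['quantization']      # quantization params
    dequantized = (raw.astype(np.float32) - zero_point) * scale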

But after that, I got a float32 output and it is not correct. I quickly found the reason: the opt.pb provided by the open-pose GitHub project is not a quantization-aware-training model, so you have to set converter.default_ranges_stats when you convert the .pb file into the .tflite file, because min-max ranges are required and the correct ones are missing from the original .pb file. I tried different ranges, such as (0, 6), (-1, 2), and others, but I can't find a range that makes the final result even close to the correct one. If you can find one, please let me know. The recommended way is to train the open-pose estimation model again with the quantization-aware-training flag set to 1.
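For reference, this is roughly where default_ranges_stats comes in with the TF 1.x converter (a sketch only; the tensor names, input shape, and the (0, 6) range are placeholder guesses, and as said above I never found a range that works well):

    import tensorflow as tf  # TF 1.x-style converter API

    # Dummy quantization of a frozen graph that was NOT trained quantization-aware.
    converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
        'graph_opt.pb', ['image'], ['Openpose/concat_stage7'],
        input_shapes={'image': [1, 432, 368, 3]})
    converter.inference_type = tf.uint8
    converter.quantized_input_stats = {'image': (128.0, 128.0)}  # (mean, std) guess
    converter.default_ranges_stats = (0, 6)                      # min/max guess for ops without stats
    open('open_pose_quant.tflite', 'wb').write(converter.convert())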

As for the segmentation fault, sorry, I didn't get that error before. Do you use the open-pose model from here? Which model do you use, and does your model pass the Compile your model for the Edge TPU step?

Anyway, I'm glad there is someone working on a similar thing. Let's keep sharing information.

mbcel commented 5 years ago

I am also having difficulties getting my own model to run on the Edge TPU. I am using the local Ubuntu compiler to compile my .tflite file. However, I realized that the compilation does not work if my image input size is too large: the compilation just hangs and gives an internal error without further specifying what actually went wrong. (Do you get any further errors during compilation, or just a general internal error?)

Did you try a bigger input size with your net (e.g. double your currently used image size), and did it still compile? I feel like there is a limit on the maximum number of nodes/calculations that can be done in one layer, and if that is exceeded it simply does not compile without telling me what's actually wrong. Do you maybe know more about that? And what inference times do you get right now with your current image size?

Regarding your problem, I think you should definitely retrain your model with quantization-aware training, since post-training quantization is really not straightforward.

Good to know I am not the only one having difficulties :)

AlbertoLanaro commented 5 years ago

@caiya55 did you use the model from here? In that case, which model did you use? Have you tried to retrain the model with quantization aware training?

Thank you.

caiya55 commented 5 years ago

Yes, I got the model from ildoonet/tf-pose-estimation. I already have a quantization-aware-training model based on MobileNet v2; the training script is available in tf_pose/train.py and you can change the quantization flag there. After that, you can run the run_checkpoint.py script to obtain eval_graph.pb, then freeze it and convert it to tflite. I get unsupported-op errors when I try to convert the frozen .pb file into a tflite file. I am still working on it. If anyone has a similar problem with unsupported ops (Cast, Size, or FusedBatchNormV3) for quantization, please share the information.
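For context, this is roughly what the TF 1.x quantization-aware-training flow looks like (a toy sketch with a stand-in two-layer network, not the tf_pose/train.py code; paths, names, and shapes are placeholders):

    import tensorflow as tf  # assumes TF 1.x with tf.contrib available

    # Tiny stand-in network; the real model is built by tf_pose/train.py.
    def build_model(images):
        net = tf.layers.conv2d(images, 8, 3, padding='same', activation=tf.nn.relu)
        return tf.layers.conv2d(net, 57, 1, padding='same', name='heatmaps')

    # Training graph: insert fake-quant ops so min/max ranges are learned.
    train_graph = tf.Graph()
    with train_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 368, 432, 3], name='image')
        build_model(images)
        tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=0)
        # ... normal training loop and checkpoint saving would go here ...

    # Eval graph: this is what gets frozen and fed to the TFLite converter.
    eval_graph = tf.Graph()
    with eval_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 368, 432, 3], name='image')
        outputs = build_model(images)
        tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())  # restore trained weights in practice
            frozen = tf.graph_util.convert_variables_to_constants(
                sess, eval_graph.as_graph_def(), [outputs.op.name])
            with open('eval_graph_frozen.pb', 'wb') as f:
                f.write(frozen.SerializeToString())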

AlbertoLanaro commented 5 years ago

Thank you @caiya55. I'm trying to train the model with quantization aware training but I'm having some issues with the training.py script. Can you please share your quantization aware trained model? Thank you!

caiya55 commented 5 years ago

Do you have WeChat?

YaroslavSchubert commented 5 years ago

Check out this thread about quantization-aware training and the new tool for post-training quantization, might be useful for you https://github.com/tensorflow/tensorflow/issues/27880#issuecomment-501436485

Mxgra commented 5 years ago

Heyo marcel1991,

did you make any progress, or could you confirm it was the input size of the model that caused your problems? What input size were you working with? I'm currently trying to build a keyword recognizer and am running into the undefined internal error too. But if I check the input size of a MobileNet .tflite model, it's 244x244x3, which is way bigger than the arrays I'm working with (as long as I don't horribly confuse two things here; I had a 2-month study break during this project).

Many thanks for an answer! Best regards, Max

Namburger commented 5 years ago

Hi @caiya55, hope you figured out your issues. For future reference, the quantization step that you did to convert the graph.pb model to a tflite model optimizes it, so getting a UINT8 output from a FLOAT32 .pb is correct. As for compiling models, the edgetpu_compiler can now be downloaded and installed directly on a host machine, so you can just compile your tflite model and scp it to your board. Here is a link to install the compiler: https://coral.withgoogle.com/docs/edgetpu/compiler/#download

As for running inference, you can just use one of the open-source demo scripts and modify it to your needs: https://coral.googlesource.com/edgetpu/+/refs/heads/release-chef/edgetpu/demo/

Hope this helps

jk78346 commented 4 years ago

I tried to convert a MobileNetV2 model into a post-training quantized tflite model, and I got the following message:

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 70
Number of operations that will run on CPU: 2

Operator                       Count      Status

ADD                            10         Mapped to Edge TPU
PAD                            5          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D                        35         Mapped to Edge TPU
DEPTHWISE_CONV_2D              17         Mapped to Edge TPU
DEQUANTIZE                     1          Operation is working on an unsupported data type
MEAN                           1          Mapped to Edge TPU
FULLY_CONNECTED                1          Mapped to Edge TPU
SOFTMAX                        1          Mapped to Edge TPU

I can't figure out why only the QUANTIZE and DEQUANTIZE operations are not supported on the Coral dev board.

Here is my python API code:

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
quant_model = converter.convert()

Maybe it doesn't matter, but later when I run it on the edgetpu using the label_image.py example code, it shows the following error:

RuntimeError: Encountered unresolved custom op: edgetpu-custom-op.Node number 1 (edgetpu-custom-op) failed to prepare.

Does anyone have a clue which step I took is problematic? Thanks.

Namburger commented 4 years ago

@jk78346 what compiler version are you running? Have you tried running the visualize tool on your model to see if it meets all requirements?

jk78346 commented 4 years ago

Hi, my edgetpu_compiler version is 2.0.267685300. I also used Netron to visualize the model, and it looks like this: [screenshot attached]

Namburger commented 4 years ago

Looks like you're up to date on your compiler. Could you actually try this one? https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/visualize.py The reason I ask is that its output actually shows all ops and tensor inputs.

jk78346 commented 4 years ago

Yes, I got the foo.html file, and it seems to be the same case: there is only an edgetpu-custom-op in between, with no other detailed layers displayed.

I think my question is: what is the output type of converter.convert()? It gives me "bytes", and so far I don't know whether this is a valid model to run.

Namburger commented 4 years ago

@jk78346 bytes is the correct type for the model; I'm saving mine in this manner:

from pathlib import Path

tflite_model = converter.convert()
out = Path('/path/to/save/model.tflite')  # pathlib.Path provides write_bytes
out.write_bytes(tflite_model)

Does foo.html show any float32 or other unusual types in the "type" column? All the steps you've taken so far look correct to me. How did you train your model?

jk78346 commented 4 years ago

Yes it has float32, hmm.

0 Identity_int8 INT8 [1, 1000] 0 {'zero_point': [-128], 'details_type': 'NONE', 'scale': [0.003906], 'quantized_dimension': 0}
1 input_1_int8 INT8 [1, 224, 224, 3] 0 {'zero_point': [-128], 'details_type': 'NONE', 'scale': [0.003922], 'quantized_dimension': 0}
2 input_1 FLOAT32 [1, 224, 224, 3] 0 {'details_type': 'NONE', 'quantized_dimension': 0}
3 Identity FLOAT32 [1, 1000] 0 {'details_type': 'NONE', 'quantized_dimension': 0}

And I just use this model:

model = tf.keras.applications.MobileNetV2(
    weights="imagenet", input_shape=(224, 224, 3))

So I tried to follow this example and still got the same situation: I can't get rid of the float32 operations.

jk78346 commented 4 years ago

After some searching, I'm not sure whether

converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

actually give quantized input/output types.

I say this because every intermediate layer now seems OK except the input/output layers.

xadrianzetx commented 4 years ago

I've got a similar problem to @jk78346's.

After converting with the flags

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

using the latest nightly, tf.lite.Interpreter().get_input_details() yields:

[{'dtype': numpy.float32,
  'index': 302,
  'name': 'input_1',
  'quantization': (0.0, 0),
  'shape': array([  1, 512, 512,   3], dtype=int32)}]

and tf.lite.Interpreter().get_output_details():

[{'dtype': numpy.float32,
  'index': 303,
  'name': 'Identity',
  'quantization': (0.0, 0),
  'shape': array([  1, 512, 512,   1], dtype=int32)}]

However, the Edge TPU compiler still succeeds with this log:

Edge TPU Compiler version 2.0.267685300
Input: MobileUNetV2.tflite
Output: MobileUNetV2_edgetpu.tflite

Operator                       Count      Status

CONV_2D                        46         Mapped to Edge TPU
CONV_2D                        7          More than one subgraph is not supported
DEPTHWISE_CONV_2D              29         Mapped to Edge TPU
DEPTHWISE_CONV_2D              7          More than one subgraph is not supported
DEQUANTIZE                     1          Operation is working on an unsupported data type
LOGISTIC                       1          More than one subgraph is not supported
ADD                            14         Mapped to Edge TPU
ADD                            2          More than one subgraph is not supported
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
RESIZE_NEAREST_NEIGHBOR        3          Operation is otherwise supported, but not mapped due to some unspecified limitation
PAD                            5          Mapped to Edge TPU
QUANTIZE                       4          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION                  2          Mapped to Edge TPU
CONCATENATION                  2          More than one subgraph is not supported

But I still get a RuntimeError on the Edge TPU:

RuntimeError: Internal: :68 tf_lite_type != kTfLiteUInt8 (9 != 3)Node number 1 (edgetpu-custom-op) failed to prepare.
Failed to allocate tensors.

mapeima commented 4 years ago

I have found here that TensorFlow 2.0 supports only float input/output. That is why my simple MNIST test model compiles like this:

Input: mnist_post_quant_model_io.tflite
Output: mnist_post_quant_model_io_edgetpu.tflite

Operator                       Count      Status

DEQUANTIZE                     1          Operation is working on an unsupported data type
SOFTMAX                        1          Mapped to Edge TPU
FULLY_CONNECTED                2          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation

The solution is to downgrade to version 1.15.

jk78346 commented 4 years ago

For preparing the .tflite model, I use TF 1.13.1 as well as tflite_convert from the command line; for running, TF 2.0 is used since I want to use the delegate.
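For reference, "using the delegate" means creating the interpreter with libedgetpu loaded, roughly like this (a sketch; the model path is a placeholder and the library name is the usual Linux default):

    import tensorflow as tf

    # Without the Edge TPU delegate, the interpreter cannot resolve edgetpu-custom-op.
    delegate = tf.lite.experimental.load_delegate('libedgetpu.so.1')
    interpreter = tf.lite.Interpreter(
        model_path='model_edgetpu.tflite',
        experimental_delegates=[delegate])
    interpreter.allocate_tensors()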

sdu2011 commented 4 years ago

Are you Chinese, by any chance? Did this code work for you without problems? I have a quantization-aware-training model converted to tflite; the float-to-int mapping should be zero_point=128, scale=1/128, and when I write my own code to load the tflite and run classification it works fine. But for some reason, when running inference on the Coral with the compiled edgetpu.tflite, in this part:

>       for (int j = 0; j < num_values; ++j) {
>         output_data[out_idx++] = (output[j] - out_tensor->params.zero_point) *
>                                  out_tensor->params.scale;

I added some prints and found that out_tensor->params.zero_point = 0 and out_tensor->params.scale = 1/255.

Eashwar93 commented 4 years ago

@caiya55 Were you able to successfully convert the openpose model for the Edge TPU using the edgetpu_compiler? I am currently trying to do it, but the edgetpu_compiler just aborts without any debug info. I downloaded the frozen graph from the same source as you (ildoonet/tf-pose-estimation). I also created a new thread regarding the issue.

If you have successfully converted it, then please let me know what I did wrong. It would also be nice if you could share the converted model, in case you are unable to find the issue in my code.

caiya55 commented 4 years ago

@Eashwar93 Thanks for asking. I was trying to convert openpose for Coral last year; I worked on it for several months and finally succeeded. But the problem is that openpose is too slow on Coral because of its pose post-processing step. The reason is obvious: Coral is good at the TPU part, but I think its CPU is not powerful. The speed I measured is around 1.5 s/frame, and the post-processing step takes 1.3 s of that.

My next step was to rewrite the post-processing step in C++ and improve the speed. I didn't continue this task because, at that time, the Google Coral team had just released its own pose estimation model for Coral, which is easy to implement and is almost real-time (7-9 frames/s), so we just use their model. You can easily find it on the official Google Coral website.

If you are still interested in converting the openpose model for Coral, my experience is: use a Keras model! I found that a Keras model passes the TPU compiler more easily, with fewer errors, when I use tf.compat.v1.lite.TFLiteConverter.from_keras_model_file. So, generally, my steps are: download the ckpt files, build a Keras model, load the ckpt parameters into each layer of the Keras model, and then save the model as an .h5 file. Trust me, compared with quantization-aware training of openpose from scratch, this is the easier way. Finally, you can convert the .h5 model to a .tflite file.

If you are interested, you can check my Colab file here; using a Colab notebook to convert is a good way to avoid configuration and version-mismatch errors.
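The final conversion step then looks roughly like this (a sketch only; the .h5 path, output path, and calibration data are placeholders, so adapt it to your own model):

    import numpy as np
    import tensorflow as tf  # TF 1.x-style converter, as mentioned above

    def representative_dataset_gen():
        # Random calibration data; use real images for a usable model.
        for _ in range(100):
            yield [np.random.random((1, 432, 368, 3)).astype(np.float32)]

    converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file('open_pose_keras.h5')
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    open('open_pose_keras_quant.tflite', 'wb').write(converter.convert())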

Eashwar93 commented 4 years ago

@caiya55 Thanks a lot for your detailed explanation. I tried Google's PoseNet, but I think the accuracy is not great enough for a real-world robot to use it. I wanted to evaluate whether Openpose offers an appropriate solution, so thanks for the heads-up that Openpose is slow on the Edge TPU. I think I will next try to write the pose inference in C++ and probably try to find out whether there is a way to do the post-processing on a GPU, if that would make it real-time.

I will try out your convert-from-Keras-file Colab notebook to see if I'm able to convert it to an edgetpu model.

Once again thanks a lot.

caiya55 commented 4 years ago

Do you have WeChat? If you do, send me a private message and we can discuss there.

Eashwar93 commented 4 years ago

@caiya55 Thanks, but unfortunately I don't use WeChat, and even if I created an account I don't have a friend here to verify it through the QR-code scanning procedure. Thanks for your help, it does mean a lot.

Ekta246 commented 4 years ago
  • Doc you were trying to follow: coral tflite file
  • Your host OS: Ubuntu
  • Your Python3 version: python 3.6

Hi @caiya55, I would like to know how you were able to convert your .pb model to a .tflite model. I want to run inference on the Google Coral for EfficientDet.

Eashwar93 commented 4 years ago

@Ekta246

You can use the code below, which worked for my model, to convert it to a tflite model.

import tensorflow as tf
import numpy as np

def representative_dataset_gen():
    for _ in range(100):
        fake_image = np.random.random((1, 432, 368, 3)).astype(np.float32)
        yield [fake_image]

graph_pb = 'graph_freeze.pb'
inp = ['image']
out = ['Openpose/concat_stage7']
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_pb, inp, out, input_shapes={"image": [1, 432, 368, 3]})
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

f = open("tflite_model/mobilenet_thin_openpose_opt_fullint_tf1.tflite", "wb")
f.write(tflite_model)
f.close()
print("conversion complete")

You need to modify the code according to your model: you will have to change the arguments graph_pb, inp, out, and input_shapes passed into TFLiteConverter.from_frozen_graph(). But I don't think you can run the tflite model directly on a Coral Edge TPU; you will have to compile the model with the edgetpu_compiler as shown here.

Ekta246 commented 4 years ago

@caiya55 First, which version of the TensorFlow converter did you use to convert the graph to the .tflite model? Second, did you happen to choose the full-integer post-training quantization option mentioned in https://www.tensorflow.org/lite/performance/post_training_quantization?

Ekta246 commented 4 years ago

@Eashwar93 Thanks for your quick response! First, I converted the saved_model to .tflite instead of using graph.pb, and I happened to convert it to a quantized .tflite file. After passing it through the edgetpu_compiler, it shows "invalid model. Model not quantized". Any help with this?

Eashwar93 commented 4 years ago

@Ekta246 Could you share the code that you used to convert the saved_model file to a quantised .tflite model?

Ekta246 commented 4 years ago

Yes, why not?

import tensorflow as tf
import numpy as np

saved_model_dir = './savedmodeldir'
saved_model_obj = tf.saved_model.load(export_dir=saved_model_dir)
print(saved_model_obj.signatures.keys())

# Load the specific concrete function from the SavedModel.
concrete_func = saved_model_obj.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]

# Set the shape of the input in the concrete function.
concrete_func.inputs[0].set_shape([1, 512, 512, 3])

# Convert the model to a TFLite model.
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = True

num_calibration_steps = 100

def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        yield [np.array([1, 512, 512, 3]).astype(np.uint8)]

converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]

tflite_quant_model = converter.convert()
open("converted.tflite", "wb").write(tflite_quant_model)

Ekta246 commented 4 years ago

When using converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir), I get errors like OSError: SavedModel doesn't exist at './path', which I have not been able to solve for a long time. I also get the error: "This converter can only convert a single ConcreteFunction. Converting multiple functions is under development."

Eashwar93 commented 4 years ago

@Ekta246 I think you are not quantising the input and output layers; I'm not sure that is supported by the edgetpu_compiler yet. In order to quantise the input and output layers you should probably add

converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

before running

tflite_quant_model = converter.convert()

Also have a look at the Netron visualizer to see whether quantisation has happened. I'm not an expert either, but I think quantising your input and output layers should solve your issue.

Ekta246 commented 4 years ago

Actually, I had already used inference_input_type = tf.uint8 and inference_output_type = tf.uint8.

And yes, I have already visualised it in the Netron app, and I only see a DEQUANTIZE block, nothing more that would indicate a quantized model.

Do you mind sharing your quantized file with me?

Thanks

Eashwar93 commented 4 years ago

@Ekta246 Oh, OK. I could not find that in the code you shared, hence the suggestion. Sure, I can share the quantized model. I am facing issues with the quantized model as well when I pass it through the Edge TPU compiler, but it's a different one; I would like to ask you to look at these issues to see if it was the same for you. The quantised model is available on both of these issue pages: https://github.com/google-coral/edgetpu/issues/100#issue-604992155 https://github.com/tensorflow/tensorflow/issues/38978#issue-608277399

If you could share your model as well, that would be nice; I'd like to have a look.