Converting a custom PyTorch UNet for Fully Quantized TFLite for Edge TPU

kuang-wei commented 3 years ago

First of all, thank you very much for your work. I've found your blog post on PyTorch conversion very informative.

My goal was to use your package to get the model from NCHW into NHWC without having Transpose layers everywhere in my quantized tflite model, so that it can run on an Edge TPU efficiently.

I largely followed your tutorial and was able to do PyTorch -> ONNX -> OpenVINO

However, an error occurs when using openvino2tensorflow (I used your Docker image, so I think dependencies aren't an issue here)

I received this error:

ERROR: Dimension 1 in both shapes must be equal, but are 5 and 4. Shapes are [1,5,64] and [1,4,64]. for '{{node tf.concat/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](Placeholder, Placeholder_1, tf.concat/concat/axis)' with input shapes: [1,5,64,48], [1,4,64,48], [] and with computed input tensors: input[2] = <-1>.
ERROR: model_path  : openvino/unet/fp32/unet_deblur_opt.xml
ERROR: weights_path: openvino/unet/fp32/unet_deblur_opt.bin
ERROR: layer_id    : 52
ERROR: input_layer0 layer_id=32: KerasTensor(type_spec=TensorSpec(shape=(1, 5, 64, 48), dtype=tf.float32, name=None), name='tf.nn.relu_5/Relu:0', description="created by layer 'tf.nn.relu_5'")
ERROR: input_layer1 layer_id=51: KerasTensor(type_spec=TensorSpec(shape=(1, 4, 64, 48), dtype=tf.float32, name=None), name='tf.identity/Identity:0', description="created by layer 'tf.identity'")

(side note: these are the most informative error messages I've seen in my journey of trying out various conversion packages)

Based on those messages, I used Netron to inspect the xml file produced by OpenVINO. I was pretty baffled, because in the graph itself the dimensions don't look mismatched. This is layer 32, simply a ReLU operation (shown at the very top):

And this is layer 51, a Pad operation:

Even at the layer where the dimension mismatch supposedly happens (layer 52, a Concat operation), the shapes of the two inputs are correct: [1, 48, 5, 64] and [1, 48, 5, 64]. There is no mismatch.

Is it possible that somehow the Pad operation isn't running properly? I did find it a little strange that the Pad layer becomes a tf.identity according to the error message ERROR: input_layer1 layer_id=51: KerasTensor(type_spec=TensorSpec(shape=(1, 4, 64, 48), dtype=tf.float32, name=None), name='tf.identity/Identity:0', description="created by layer 'tf.identity'")
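To make the comparison between the Netron shapes (NCHW) and the shapes in the error message (NHWC) explicit, here is a small sketch. The helper names are made up for illustration, and the Pad input shape [1, 48, 4, 64] is an assumption inferred from the error message, not something read from the graph.

```python
def nchw_to_nhwc(shape):
    """Reorder an NCHW shape tuple into NHWC."""
    n, c, h, w = shape
    return (n, h, w, c)


def concat_mismatches(shape_a, shape_b, axis):
    """Return the axes, other than the concat axis, where two shapes disagree."""
    axis = axis % len(shape_a)  # normalize a negative axis such as -1
    return [
        i
        for i, (a, b) in enumerate(zip(shape_a, shape_b))
        if i != axis and a != b
    ]


relu_out_nchw = (1, 48, 5, 64)  # layer 32 output, as shown in Netron
pad_out_nchw = (1, 48, 5, 64)   # layer 51 output, as shown in Netron
pad_in_nchw = (1, 48, 4, 64)    # ASSUMED Pad input (height not yet padded)

# What the OpenVINO graph says Concat should receive: no mismatch.
expected = concat_mismatches(
    nchw_to_nhwc(relu_out_nchw), nchw_to_nhwc(pad_out_nchw), axis=-1
)

# What the error message reports: the second input still has height 4.
observed = concat_mismatches(
    nchw_to_nhwc(relu_out_nchw), nchw_to_nhwc(pad_in_nchw), axis=-1
)

print(expected)  # []
print(observed)  # [1]
```

Under that assumption, the mismatch on axis 1 (height, 5 vs 4) is what you would see if the Pad layer passed its input through unchanged, which would match it showing up as tf.identity.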

Please let me know if there is anything else you want me to elaborate on, I'm happy to provide more details

One minor note is that I didn't fully follow your blog post for the PyTorch -> ONNX step. Instead of using the backend module of OpenVINO's model downloader, I just ran torch.onnx.export on my own, with these export settings:

export_params=True
do_constant_folding=True
opset_version=11

I used onnxsim to further optimize the ONNX model as well
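For reference, the export and simplification steps might look roughly like the sketch below. Only the three keyword arguments come from this comment; the function name, the output path, and the `model` and `dummy_input` arguments are placeholders.

```python
# Export settings quoted in the comment above.
EXPORT_KWARGS = {
    "export_params": True,
    "do_constant_folding": True,
    "opset_version": 11,
}


def export_and_simplify(model, dummy_input, onnx_path="unet.onnx"):
    """Export a PyTorch model to ONNX, then simplify it with onnxsim.

    `model`, `dummy_input` and `onnx_path` are placeholders (assumptions),
    not values taken from the original model.
    """
    # Imported here so the settings above can be inspected without
    # torch, onnx or onnxsim installed.
    import onnx
    import torch
    from onnxsim import simplify

    model.eval()
    torch.onnx.export(model, dummy_input, onnx_path, **EXPORT_KWARGS)

    # onnxsim.simplify returns the simplified model and a validation flag.
    simplified, ok = simplify(onnx.load(onnx_path))
    if not ok:
        raise RuntimeError("onnxsim could not validate the simplified model")
    onnx.save(simplified, onnx_path)
    return onnx_path
```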

kuang-wei commented 3 years ago

Update:

I changed my input image size so that the height is a multiple of 2, so I no longer need these awkward padding operations
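A quick way to check whether a given size avoids the padding is sketched below. It assumes every encoder level halves the spatial size with floor rounding and every decoder level doubles it; the number of levels (3 in the example) is hypothetical, since the depth of the network is not stated here.

```python
def is_pad_free(size, num_downsamples):
    """True if `size` halves evenly `num_downsamples` times.

    With floor rounding, a skip connection needs padding whenever the size
    at some encoder level is odd, so the size must be divisible by
    2 ** num_downsamples.
    """
    return size % (2 ** num_downsamples) == 0


def next_pad_free(size, num_downsamples):
    """Smallest size >= `size` that needs no padding."""
    step = 2 ** num_downsamples
    return -(-size // step) * step  # ceiling division, then scale back up


# Hypothetical depth of 3 downsampling steps.
print(is_pad_free(40, 3))    # True:  40 -> 20 -> 10 -> 5
print(is_pad_free(36, 3))    # False: 36 -> 18 -> 9 -> 4, and 4 * 2 != 9
print(next_pad_free(36, 3))  # 40
```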

I've verified the saved_model where the input shape is now in NHWC

tf_model = tf.saved_model.load(PATH_TO_SAVED_MODEL)
infer = tf_model.signatures["serving_default"]
print(infer.structured_outputs)

>>> {'tf.identity': TensorSpec(shape=(1, 32, 256, 1), dtype=tf.float32, name='tf.identity')}

In addition, I verified that the saved model produces an output that agrees with PyTorch to within 1e-5.
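The comparison can be sketched as follows. The function name is made up, and random data stands in for the real PyTorch and TensorFlow outputs; the only detail taken from above is the NHWC output shape (1, 32, 256, 1).

```python
import numpy as np


def max_abs_diff_nchw_vs_nhwc(torch_out_nchw, tf_out_nhwc):
    """Largest absolute difference between an NCHW and an NHWC tensor."""
    torch_as_nhwc = np.transpose(torch_out_nchw, (0, 2, 3, 1))
    if torch_as_nhwc.shape != tf_out_nhwc.shape:
        raise ValueError(
            f"shape mismatch: {torch_as_nhwc.shape} vs {tf_out_nhwc.shape}"
        )
    return float(np.max(np.abs(torch_as_nhwc - tf_out_nhwc)))


# Stand-in data (ASSUMPTION): replace with the real model outputs.
rng = np.random.default_rng(0)
torch_out = rng.standard_normal((1, 1, 32, 256)).astype(np.float32)  # NCHW
tf_out = np.transpose(torch_out, (0, 2, 3, 1)).copy()                # NHWC

diff = max_abs_diff_nchw_vs_nhwc(torch_out, tf_out)
assert diff < 1e-5
```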

So NCHW -> NHWC was a success

However now I'm encountering issues with quantization:

Integer Quantization started ========================================================
ERROR: tensorflow/lite/kernels/conv.cc:349 input->dims->data[3] != filter->dims->data[3] (3 != 1)Node number 1 (CONV_2D) failed to prepare.

Traceback (most recent call last):
  File "/usr/local/bin/openvino2tensorflow", line 2563, in convert
    tflite_model = converter.convert()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py", line 921, in convert
    result = self._calibrate_quantize_model(result, **flags)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/lite.py", line 522, in _calibrate_quantize_model
    self.representative_dataset.input_gen)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/optimize/calibrator.py", line 172, in calibrate
    self._calibrator.Prepare([list(s.shape) for s in sample])
RuntimeError: tensorflow/lite/kernels/conv.cc:349 input->dims->data[3] != filter->dims->data[3] (3 != 1)Node number 1 (CONV_2D) failed to prepare.

It looks like there is another dimension mismatch. I'm more lost here since I don't know where in the network exactly this is occurring. Would you happen to have some debugging tips?

kuang-wei commented 3 years ago

I was able to perform full integer quantization and compile the model with edgetpu_compiler once I fed in my own representative_data
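The earlier CONV_2D error (3 != 1) is consistent with calibration samples that had 3 channels while the first convolution expects 1. Below is a minimal sketch of supplying a representative dataset through the TFLite converter API directly, rather than through openvino2tensorflow. The helper names, the random calibration data, and the input shape (32, 256, 1) are assumptions; use real preprocessed inputs that match the model's actual input signature.

```python
import numpy as np


def make_representative_dataset(samples):
    """Wrap calibration arrays as a TFLite representative dataset function."""
    def generator():
        for sample in samples:
            batch = np.asarray(sample, dtype=np.float32)
            if batch.ndim == 3:  # HWC -> 1HWC
                batch = batch[np.newaxis, ...]
            yield [batch]  # one list entry per model input
    return generator


def convert_full_integer(saved_model_dir, representative_dataset):
    """Full integer quantization of a saved_model (needs TensorFlow)."""
    import tensorflow as tf  # imported lazily; only needed for conversion

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()


# Placeholder calibration data (ASSUMPTION): 4 single-channel samples.
calibration = np.random.default_rng(0).random((4, 32, 256, 1), dtype=np.float32)
representative_dataset = make_representative_dataset(calibration)

first_batch = next(representative_dataset())
print(first_batch[0].shape)  # (1, 32, 256, 1)
```

The channel count of each yielded batch has to match what the first convolution expects, which is exactly the check that failed in the traceback above.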

The problem that occurred in my first comment still baffles me, though it's no longer a blocker. Thank you for your work!

kuang-wei commented 3 years ago

Ran into another issue: the quantized TFLite model (I used your package to convert from OpenVINO to TensorFlow saved_model, then convert the saved_model to quantized TFLite) produced very sensible outputs

However, once it's recompiled by edgetpu_compiler the model output becomes completely different from the original quantized tflite model output

I've done the Edge TPU compilation a few times, and have never seen this behavior before.
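To put a number on "completely different", the two quantized outputs can be dequantized and compared, as in the sketch below. The function names and the toy arrays are illustrative only. In practice the arrays would come from each interpreter's `get_tensor`, and the scale and zero point from the `quantization` entry of `get_output_details()`.

```python
import numpy as np


def dequantize(q, scale, zero_point):
    """Map quantized integers back to real values."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    return (np.asarray(q, dtype=np.float32) - zero_point) * scale


def compare_quantized_outputs(cpu_q, tpu_q, scale, zero_point):
    """Summarize how far two quantized outputs are from each other."""
    cpu_q = np.asarray(cpu_q)
    tpu_q = np.asarray(tpu_q)
    diff = np.abs(
        dequantize(cpu_q, scale, zero_point) - dequantize(tpu_q, scale, zero_point)
    )
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "mismatched_fraction": float(np.mean(cpu_q != tpu_q)),
    }


# Toy values (ASSUMPTION), standing in for real interpreter outputs.
cpu_out = np.array([0, 128, 255], dtype=np.uint8)
tpu_out = np.array([1, 128, 255], dtype=np.uint8)
report = compare_quantized_outputs(cpu_out, tpu_out, scale=0.5, zero_point=128)
print(report["max_abs_diff"])  # 0.5
```

A difference of one or two quantization steps would be ordinary rounding variation, while a large mismatched fraction with large differences would point at a specific op being handled differently after compilation.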

I know this is pretty decoupled from the original reason that I opened this issue, but if you have seen this kind of problem before please let me know. Thank you!

PINTO0309 commented 3 years ago

It is taking me a long time to reply because so many issues are addressed to me, including ones from other repositories.

First, if possible, share the model you are trying to convert. (.onnx) It is necessary to identify the problem areas and test the program after it has been modified.

Optimization for the EdgeTPU compiler is done inside the tool: HardSwish detoxification, ResizeBilinear and NearestNeighbor bug workarounds, standard layer conversion for TFLite_Detection_PostProcess, and layer substitution to make the model usable in all frameworks except TFLite. It is not appropriate to explain everything here.

It is very difficult to investigate with fragmented information, so please provide a sample model if possible. It may or may not be a problem that can already be solved using the tool's options. If you can't publish publicly, direct mail is fine.

By the way, it's all a hobby. I'm not an expert.

PINTO0309 commented 3 years ago

Closed due to lack of response.

PINTO0309 / openvino2tensorflow

Converting a custom PyTorch UNet for Fully Quantized TFLite for Edge TPU #44