google-coral / edgetpu

Coral issue tracker (and legacy Edge TPU API source)
https://coral.ai
Apache License 2.0

edgetpu_compiler fails to map the model to the Edge TPU if there is an Add operation after a Pad operation #690

Closed mathieugalle closed 1 year ago

mathieugalle commented 1 year ago

Description

Hi,

The Edge TPU compiler does not compile my model entirely if a "Pad" operation is present just before an "Add" operation. I don't think this is an issue with the tensor dimensions, because the working model has a bigger input shape than the failing one. If I remove the "Pad" operation by resizing my input shape, everything compiles.

The documentation about supported operations says that all Pad and Add operations are supported.

I reproduced the issue in a repository. Set the boolean padded = True in the Jupyter notebook to generate a model that fails to compile. I included the tflite files before and after compilation for both versions, the edgetpu_compiler outputs, and Netron captures.
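For readers without the repository at hand, here is a minimal sketch (not the notebook's actual code) of the Pad -> Add pattern together with the full-integer quantization step that is required before running edgetpu_compiler. All shapes, the representative dataset, and the file name pad_add.tflite are illustrative assumptions, not values from the linked repo.

```python
import numpy as np
import tensorflow as tf

# Two feature maps whose heights differ by one row, as described later in this issue.
fine = tf.keras.Input(shape=(15, 20, 16), name="fine")    # odd height
coarse = tf.keras.Input(shape=(16, 20, 16), name="coarse")

# Pad one row on top so the spatial shapes match, then Add (the failing pattern).
padded = tf.keras.layers.ZeroPadding2D(padding=((1, 0), (0, 0)))(fine)
outputs = tf.keras.layers.Add()([padded, coarse])
model = tf.keras.Model([fine, coarse], outputs)

# Full-integer quantization is needed before the model can target the Edge TPU.
def representative_dataset():
    for _ in range(10):
        yield [
            np.random.rand(1, 15, 20, 16).astype(np.float32),
            np.random.rand(1, 16, 20, 16).astype(np.float32),
        ]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
open("pad_add.tflite", "wb").write(converter.convert())
# then: edgetpu_compiler pad_add.tflite
```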

In short, this fails:

netron_capture_padded

The compiled edgetpu graph has multiple subgraphs:

netron_capture_padded_edgetpu

But this works:

netron_capture_unpadded netron_capture_not_padded_edgetpu

Longer story:

I am trying to implement an object detector from scratch with Keras, without using the Google object detection API. My goal is to be able to customize the architecture entirely and to be sure I understand everything. Because this is error-prone, the work is heavily inspired by (= mostly copy-pasted from) the Keras RetinaNet tutorial implementation. Instead of using ResNet as a backbone, I modified the code to use MobileNetV2 (with existing weights, which could save me training time and pain). I also changed the input tensor shape to (120, 160, 3) because this is my camera setup.

In the FPN architecture, you upsample a feature map by 2, then add it to the next feature map. For this to work, consecutive feature map shapes must be related by an exact factor of 2, like (8, 10, 3) => (4, 5, 3). Because of my input size this was not the case (the height dimension is not divisible by enough powers of 2), so I added a Pad operation to make the FPN architecture work. More precisely, I had to Add a (None, 15, 20, 16) tensor to a (None, 16, 20, 16) tensor, so I used a tf.keras.layers.ZeroPadding2D(padding=((1, 0), (0, 0))) (see the sketch below).
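For context, here is a hedged sketch of that FPN merge step. The names p3_output / p4_output follow the RetinaNet-style pyramid levels discussed in this thread; the stand-in Input layers and the channel count are assumptions for illustration.

```python
import tensorflow as tf

# Stand-ins for the backbone features; in the real model they come from
# MobileNetV2 with a (120, 160, 3) input.
p3_output = tf.keras.Input(shape=(15, 20, 16), name="p3")  # finer level, odd height
p4_output = tf.keras.Input(shape=(8, 10, 16), name="p4")   # coarser level

# FPN merge: upsample the coarser map x2, then add it to the finer map.
p4_upsampled = tf.keras.layers.UpSampling2D(size=2)(p4_output)  # -> (None, 16, 20, 16)

# Height 15 cannot be reached by doubling 8, so pad one row before the Add.
# Padding on top, ((1, 0), (0, 0)), is the variant that fails to map fully.
p3_output_padded = tf.keras.layers.ZeroPadding2D(padding=((1, 0), (0, 0)))(p3_output)

p3_merged = tf.keras.layers.Add()([p3_output_padded, p4_upsampled])  # (None, 16, 20, 16)
```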

The "failing" model, with pad, has an input shape of (120, 160, 3), while the working model has an input shape of (124, 160, 3) and thus does not need a Pad operation. Unless I'm wrong, this is the only difference between the two models.

Could you confirm that this is a bug? If so, would it be possible to investigate it, or to mention it somewhere in the documentation? Thanks!

Issue Type: Bug
Operating System: Linux
Coral Device: USB Accelerator
Other Devices: No response
Programming Language: Python 3.9
Relevant Log Output: No response
hjonnala commented 1 year ago

Please try changing this line to p3_output_padded = tf.keras.layers.ZeroPadding2D(padding=((0, 1), (0, 0)))(p3_output). You should then be able to map all ops to the TPU.
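For reference, a sketch of the suggested change in context (only the padding tuple differs; variable names follow the comment above):

```python
# Original: pad one row on top -> compiler splits the graph into subgraphs.
# p3_output_padded = tf.keras.layers.ZeroPadding2D(padding=((1, 0), (0, 0)))(p3_output)

# Suggested: pad one row on the bottom instead. The output shape is identical,
# but all ops are reported as mapped to the Edge TPU.
p3_output_padded = tf.keras.layers.ZeroPadding2D(padding=((0, 1), (0, 0)))(p3_output)
```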

mathieugalle commented 1 year ago

Wow, your line works, thank you! I did not think that padding the "bottom" instead of the "top" would change anything, since the resulting tensor has exactly the same shape. I padded the "top" because the objects I try to detect are located on the ground, rarely above the horizon line.

Out of curiosity, why does it behave differently? The documentation says that all paddings should work. Is it easier to add values at the end of the tensor rather than at the beginning (and then having to shift everything)?

google-coral-bot[bot] commented 1 year ago

Are you satisfied with the resolution of your issue? Yes No