OAID / Tengine

Tengine is a lite, high performance, modular inference engine for embedded device
Apache License 2.0

problem with ResizeNearestNeighbor in tensorflow #264

Open wangg opened 4 years ago

wangg commented 4 years ago

TensorFlow version: 1.13.2
Convert tool: compiled from the master branch of this repo

Observation: When the input batch size is specified, conversion of ResizeNearestNeighbor seems to work only if the op is followed by a conv layer. Is this expected behavior?

Update: When the batch size is specified and the resize layer is followed by a conv layer, the model can be converted, but Tengine raises an error when running the tmfile model:

Tengine/core/lib/graph_executor.cpp:357 infer shape failed on node: conv2d_1/Conv2D due to input: conv2d_1/bias size is zero

Model 1 (batch size specified, conv layer after the resize) can be converted to tmfile successfully:

from tensorflow import keras  # tf.keras, as bundled with TensorFlow 1.13

inputs = keras.Input(shape=(8, 8, 16), batch_size=1)
x = keras.layers.Conv2D(16, 3, strides=2, padding='same')(inputs)
x = keras.layers.UpSampling2D()(x)
x = keras.layers.Conv2D(16, 1, strides=1, padding='same')(x)
x = keras.layers.Add()([inputs, x])
model = keras.Model(inputs=inputs, outputs=x)

Model 2 (batch size unspecified, no conv layer after the resize) can also be converted to tmfile successfully:

inputs = keras.Input(shape=(8, 8, 16), batch_size=None)
x = keras.layers.Conv2D(16, 3, strides=2, padding='same')(inputs)
x = keras.layers.UpSampling2D()(x)
x = keras.layers.Add()([inputs, x])
model = keras.Model(inputs=inputs, outputs=x)

However, Model 3, which removes the conv layer after the resize and sets the batch size to a specific number, fails to convert:

inputs = keras.Input(shape=(8, 8, 16), batch_size=1)
x = keras.layers.Conv2D(16, 3, strides=2, padding='same')(inputs)
x = keras.layers.UpSampling2D()(x)
x = keras.layers.Add()([inputs, x])
model = keras.Model(inputs=inputs, outputs=x)

Error message:

error on load node: add/add op: Add
Create graph failed
errno: 0
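As a sanity check that the three definitions are valid on the Keras side, the shape arithmetic can be traced by hand. The sketch below is plain Python, not Tengine or TensorFlow code; the helper names are hypothetical, and it assumes the usual rules that a 'same'-padded convolution outputs ceil(size / stride) and that UpSampling2D with its default size doubles height and width.

```python
import math

def conv2d_same(shape, filters, stride):
    """Output shape of a Conv2D with padding='same' on an NHWC input."""
    n, h, w, _ = shape
    return (n, math.ceil(h / stride), math.ceil(w / stride), filters)

def upsampling2d(shape, size=2):
    """Output shape of UpSampling2D, which repeats rows and columns `size` times."""
    n, h, w, c = shape
    return (n, h * size, w * size, c)

def model3_shapes(batch_size):
    """Return the shapes of the two tensors fed to Add() in Model 3."""
    inputs = (batch_size, 8, 8, 16)
    x = conv2d_same(inputs, filters=16, stride=2)  # 8x8 -> 4x4
    x = upsampling2d(x)                            # 4x4 -> 8x8
    return inputs, x

for batch in (1, None):
    inputs, x = model3_shapes(batch)
    print(batch, inputs, x, inputs == x)
```

Both operands of Add() have the same shape whether or not the batch size is fixed, so the failure does not appear to come from a shape mismatch in the model definition itself.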

Alicture commented 4 years ago

@wangg Can you share your model with us? It seems like a bug.

wangg commented 4 years ago

@Alicture This repo has everything needed to reproduce the problem. model1.py, model2.py, and model3.py generate the TensorFlow models I mentioned above. convert.sh shows how I call the Python scripts to generate the TF models and then apply the convert tool. Both model1 and model2 can be converted successfully, but model3 raises an error.

https://github.com/wangg/bug_report

wangg commented 4 years ago

@Alicture

I tried to run the converted model1 and model2.

Model 1 can be converted to tmfile, but Tengine raises an error when running the model:

Tengine/core/lib/graph_executor.cpp:357 infer shape failed on node: conv2d_1/Conv2D due to input: conv2d_1/bias size is zero

It seems that specified batch size + resize somehow caused a problem.

Model 2 can be run successfully.

wangg commented 4 years ago

According to Netron's visualization, when the batch size is specified, the resize layer disappears after conversion to tmfile.

Model1 (raises runtime error): [image: model1 tmfile]

Model2 (runs as expected): [image: model2 tmfile]
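For reference, this is roughly what the missing resize layer computes. The sketch is a pure-Python illustration of nearest-neighbor upsampling by an integer factor, not Tengine's or TensorFlow's implementation; the function name is hypothetical, and it works on a single-channel 2-D grid rather than an NHWC tensor.

```python
def resize_nearest(image, scale=2):
    """Upsample a 2-D grid (list of rows) by repeating each pixel `scale` times."""
    out = []
    for row in image:
        # Repeat every value horizontally.
        wide = [value for value in row for _ in range(scale)]
        # Repeat the widened row vertically, copying so rows are independent.
        out.extend(list(wide) for _ in range(scale))
    return out

small = [[1, 2],
         [3, 4]]
big = resize_nearest(small)
for row in big:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

If the converted graph really has no such node, the tensor after the strided conv would presumably stay at half resolution, which would be consistent with the later layers failing.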