Closed PushyamiKaveti closed 4 years ago
The error you encounter will occur with TensorFlow 2.0, too. It occurs if a convolution layer has dilation values that don't follow the pattern [1, h, w, 1], i.e. a dilation other than 1 in the batch or depth dimension. Now resnet50 doesn't contain any layers with dilation values other than [1, 1, 1, 1] (the default), which makes me curious which model you used.
I tried to find the model in question and I stumbled upon this: Body-Pix
The resnet50 model they use is located at resnet50
Can you confirm that this is the model you used? I downloaded the model and ran it through a small test script and it worked without any issues:
import os
# make tensorflow stop spamming messages
os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3"
import numpy as np
import tensorflow as tf
import tfjs_graph_converter as tfjs

# Python does not expand '~' by itself, so resolve the paths explicitly
model_dir = os.path.expanduser('~/tfjs/models/resnet50/')  # downloaded from the link above
image_path = os.path.expanduser('~/tfjs/assets/sample.jpg')

print("Loading resnet50...", end="")
graph = tfjs.api.load_graph_model(model_dir)
print("done.\nLoading sample image...", end="")
# load sample image into numpy array
img = tf.keras.preprocessing.image.load_img(image_path)
x = tf.keras.preprocessing.image.img_to_array(img, dtype=np.float32)
# add imagenet mean - extracted from body-pix source
# (the values are negative, so this subtracts the per-channel mean)
m = np.array([-123.15, -115.90, -103.06])
x = np.add(x, m)
# add the batch dimension
sample_image = x[tf.newaxis, ...]
print("done.\nRunning inference...", end="")
# evaluate the loaded model directly
with tf.compat.v1.Session(graph=graph) as sess:
    input_tensor_names = tfjs.util.get_input_tensors(graph)
    output_tensor_names = tfjs.util.get_output_tensors(graph)
    input_tensor = graph.get_tensor_by_name(input_tensor_names[0])
    results = sess.run(output_tensor_names, feed_dict={input_tensor: sample_image})
print("done. {} outputs received".format(len(results)))  # should be 8 outputs
Same issue for me. I was trying to use the ResNet50 stride 16 PoseNet model.
Tried running your model evaluation code: tensorflow.python.framework.errors_impl.InvalidArgumentError: Current implementation does not yet support dilations in the batch and depth dimensions.
I am using TF ver 1.15.0 and TFJS 1.4.0, python3.7 on mac
@ajaichemmanam Thanks for providing a link to the model. Without even testing the model, I can tell just from looking at the json that the issue is with these 3 layers:
It seems to me that this particular RESNET50 implementation simply isn't compatible with TF. The TF documentation confirms this: "Dilations in the batch and depth dimensions if a 4-d tensor must be 1." This applies to TF 2.0 as well. The aforementioned Conv2D layers have their dilation values set to [2, 2, 1, 1], which puts a dilation of 2 in the batch dimension. That isn't compatible with TF 1.15 or TF 2.0, and there's nothing the converter can do about that.
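To make the constraint concrete, here is a minimal sketch that scans a TFJS graph-model JSON for Conv2D nodes whose dilations violate it. The JSON layout (`modelTopology` -> `node` -> `attr.dilations.list.i`, with values stored as strings), the NHWC ordering, and the node names in the example are assumptions for illustration; they are not confirmed by this thread.

```python
def find_unsupported_dilations(model_json):
    """Return (node name, dilations) for Conv2D nodes that TF would reject.

    Assumes NHWC order, so index 0 is the batch dimension and index 3 the
    depth dimension; both must be 1.
    """
    # Assumed layout of a TFJS graph model's model.json (hypothetical here)
    nodes = model_json.get("modelTopology", {}).get("node", [])
    offenders = []
    for node in nodes:
        if node.get("op") != "Conv2D":
            continue
        raw = node.get("attr", {}).get("dilations", {}).get("list", {}).get("i", [])
        dilations = [int(v) for v in raw]  # values may be stored as strings
        if len(dilations) == 4 and (dilations[0] != 1 or dilations[3] != 1):
            offenders.append((node.get("name"), dilations))
    return offenders


# Hypothetical two-node model: one valid layer, one with [2, 2, 1, 1]
example_model = {
    "modelTopology": {
        "node": [
            {"name": "conv_ok", "op": "Conv2D",
             "attr": {"dilations": {"list": {"i": ["1", "1", "1", "1"]}}}},
            {"name": "conv_bad", "op": "Conv2D",
             "attr": {"dilations": {"list": {"i": ["2", "2", "1", "1"]}}}},
        ]
    }
}

print(find_unsupported_dilations(example_model))
```

For a real model, the dictionary would come from `json.load` on the downloaded model.json, provided its layout matches the assumption above.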
Now here comes the interesting part. It seems as if this isn't even compatible with TFJS: dilationRate: Should be an integer or array of two integers.
@PushyamiKaveti @ajaichemmanam It seems to me as if the Google team messed up the format conversion while converting the model to TFJS. I will add a workaround later today. Expect an updated version in a couple of hours 👍
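One conceivable shape for such a workaround is sketched below. It is not taken from the converter and this thread does not confirm that the actual fix works this way. The sketch assumes the two leading values of a list like [2, 2, 1, 1] were meant as the height and width dilations and moves them into the NHWC positions; `realign_dilations` is a hypothetical helper name.

```python
def realign_dilations(dilations):
    """Move misplaced spatial dilations into NHWC order.

    Assumption: a list of the form [h, w, 1, 1] with h != 1 was meant as
    [1, h, w, 1]. Lists that already have 1 in the batch position are
    returned unchanged (as integers).
    """
    values = [int(v) for v in dilations]  # accept string or integer entries
    if len(values) == 4 and values[0] != 1 and values[2:] == [1, 1]:
        return [1, values[0], values[1], 1]
    return values


print(realign_dilations(["2", "2", "1", "1"]))  # misaligned input
print(realign_dilations([1, 1, 1, 1]))          # default, left as-is
```

The rule is deliberately narrow: it only touches lists that TF would reject anyway because the batch dilation is not 1.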
@patlevin Thanks for the reply and the awesome work. I was arriving at the same conclusion. The mobilenet version of posenet works well after conversion, so the problem is indeed with the resnet model, due to tensorflow changes.
@patlevin Thank you so much for fixing it. Like @ajaichemmanam mentioned, I was using the stride16 model, which has dilation values of ["2", "2", "1", "1"]; these don't appear in the stride32 model.
The fix also worked for me. Thank you @patlevin for this great project!
Where could you find "add imagenet mean - extracted from body-pix source" from source code? And the output part_heatmaps is 16-stride, how can I map to the original size? Is it bilinear interpolation (x16), then softmax and argmax, get the partId result at last?
@DCC-lzhy
Where could you find "add imagenet mean - extracted from body-pix source" from source code?
It's the preprocessing step of the resnet50 model used by BodyPix. You can find it right here: resnet (line 22).
And the output part_heatmaps is 16-stride, how can I map to the original size? Is it bilinear interpolation (x16), then softmax and argmax, get the partId result at last?
The decoding process is described in the BodyPix source code, for example here: decode_single_pose.ts.
Please understand that I cannot answer questions about particular models and how to use them.
The authors have released the source code and detailed descriptions in a blog-post.
Good luck!
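For readers who want to experiment with the approach proposed in the question, here is a minimal NumPy sketch of bilinear upsampling by the output stride followed by a per-pixel argmax. It only illustrates the question's idea; whether it matches the BodyPix decoder is not confirmed here, and the function names, the (H, W, C) layout, and the toy input are assumptions. Softmax is left out because it preserves the ordering of the channels, so the argmax is the same with or without it.

```python
import numpy as np


def upsample_bilinear(heatmaps, factor):
    """Bilinearly resize an (H, W, C) array to (H*factor, W*factor, C)."""
    h, w, _ = heatmaps.shape
    # Source coordinates of each output pixel centre, clamped to the grid
    ys = np.clip((np.arange(h * factor) + 0.5) / factor - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * factor) + 0.5) / factor - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]  # vertical blend weights
    wx = (xs - x0)[None, :, None]  # horizontal blend weights
    top = heatmaps[y0][:, x0] * (1 - wx) + heatmaps[y0][:, x1] * wx
    bottom = heatmaps[y1][:, x0] * (1 - wx) + heatmaps[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def decode_part_ids(part_heatmaps, stride=16):
    """Upsample by the stride, then pick the strongest channel per pixel."""
    upsampled = upsample_bilinear(part_heatmaps, stride)
    return np.argmax(upsampled, axis=-1)


# Toy 2x2 heatmap with 3 channels (hypothetical values)
toy = np.full((2, 2, 3), 0.1)
toy[0, 0, 1] = 1.0
toy[1, 1, 2] = 1.0
part_ids = decode_part_ids(toy, stride=16)
print(part_ids.shape)
```

If the input image was resized or padded before inference, the upsampled map would still need to be cropped or scaled back accordingly; that step depends on the preprocessing used and is not covered by this sketch.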
Thank you very much.
I followed your readme and was able to load_graph_model() for the bodypix CNN. However, when I try to run the session on a sample image, I get the following error: Current implementation does not yet support dilations in the batch and depth dimensions. [[node resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/Relu (defined at /home/auv/anaconda3/envs/bodypix/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Is it because of compatibility issues between TensorFlow 1.x and TensorFlow 2.x? Please help.