I fine-tuned the faster_rcnn_resnet50_keras model from the TF Object Detection API on my own dataset and used the exporter_main_v2 script under models/research/object_detection/ to convert one of my checkpoints to the saved model format, so that I can serve it in a Golang application. I can load the saved model files in Golang using the TF Golang client, but when I run a forward pass I get the following error:
2021-07-30 15:45:06.876593: I tensorflow/cc/saved_model/loader.cc:303] SavedModel load for tags { serve }; Status: success: OK. Took 1198184 microseconds.
{"severity":"info","timestamp":"2021-07-30T15:45:07.743500777Z","caller":"/go/src/bitbucket.org/ehsai/doc-structure/internal/classifier/object_detection_init.go:30","message":"Loading Tensorflow Model labels: /go/src/bitbucket.org/ehsai/doc-structure/models/classifiers/document_structure/object_detection/v14"}
2021-07-30 15:45:21.226358: E tensorflow/core/framework/tensor.cc:555] Could not decode variant with type_name: "tensorflow::TensorList". Perhaps you forgot to register a decoder via REGISTER_UNARY_VARIANT_DECODE_FUNCTION?
2021-07-30 15:45:21.226403: W tensorflow/core/framework/op_kernel.cc:1744] OP_REQUIRES failed at constant_op.cc:82 : Invalid argument: Cannot parse tensor from tensor_proto.
2021-07-30 15:45:21.255391: E tensorflow/core/framework/tensor.cc:555] Could not decode variant with type_name: "tensorflow::TensorList". Perhaps you forgot to register a decoder via REGISTER_UNARY_VARIANT_DECODE_FUNCTION?
2021-07-30 15:45:21.255454: W tensorflow/core/framework/op_kernel.cc:1744] OP_REQUIRES failed at constant_op.cc:82 : Invalid argument: Cannot parse tensor from proto: dtype: DT_VARIANT
tensor_shape {
}
variant_val {
type_name: "tensorflow::TensorList"
metadata: "\001\000\001\377\377\377\377\377\377\377\377\377\001\030\001"
}
{"severity":"error","timestamp":"2021-07-30T15:45:21.265370446Z","caller":"/go/src/bitbucket.org/ehsai/doc-structure/internal/classifier/object_detection_run.go:135","message":"An error occurred during forwad pass, err=Cannot parse tensor from proto: dtype: DT_VARIANT\ntensor_shape {\n}\nvariant_val {\n type_name: \"tensorflow::TensorList\"\n metadata: \"\\001\\000\\001\\377\\377\\377\\377\\377\\377\\377\\377\\377\\001\\030\\001\"\n}\n\n\t [[{{node StatefulPartitionedCall/StatefulPartitionedCall/map/TensorArrayV2_1/_0__cf__4}}]]"}
suite.go:63: test panicked: runtime error: index out of range [2] with length 0
goroutine 158 [running]:
runtime/debug.Stack(0xc001f95710, 0x9f4bc0, 0xc0000fa000)
/usr/local/go/src/runtime/debug/stack.go:24 +0x9f
github.com/stretchr/testify/suite.failOnPanic(0xc000d03e00)
/go/pkg/mod/github.com/stretchr/testify@v1.7.0/suite/suite.go:63 +0x57
panic(0x9f4bc0, 0xc0000fa000)
/usr/local/go/src/runtime/panic.go:969 +0x175
bitbucket.org/ehsai/doc-structure/internal/classifier.(*objectDetectionModelSuite).Test_modelOutputsShape(0xc0043c60a0)
/go/src/bitbucket.org/ehsai/doc-structure/internal/classifier/object_detection_test.go:146 +0xb45
reflect.Value.call(0xc00440b620, 0xc004408550, 0x13, 0xa376d8, 0x4, 0xc000325e30, 0x1, 0x1, 0xc000325cf8, 0x41142a, ...)
/usr/local/go/src/reflect/value.go:475 +0x8c7
reflect.Value.Call(0xc00440b620, 0xc004408550, 0x13, 0xc000325e30, 0x1, 0x1, 0x24, 0xcf345, 0x519dc4)
/usr/local/go/src/reflect/value.go:336 +0xb9
github.com/stretchr/testify/suite.Run.func1(0xc000d03e00)
/go/pkg/mod/github.com/stretchr/testify@v1.7.0/suite/suite.go:158 +0x379
testing.tRunner(0xc000d03e00, 0xc004155cb0)
/usr/local/go/src/testing/testing.go:1127 +0xef
created by testing.(*T).Run
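As far as I can tell, the panic at the end of the trace is a secondary failure rather than part of the TensorFlow problem: the forward pass returns an error and apparently no outputs, and the test then indexes into the empty result. A minimal sketch of a guard that would surface the underlying error instead of panicking is below. The names `runFunc`, `outputAt`, and `errVariant` are hypothetical and only for illustration; they are not part of the TF Golang client or of the code in this report.

```go
package main

import (
	"errors"
	"fmt"
)

// errVariant is a sample error mimicking the message seen in the log above.
var errVariant = errors.New("Cannot parse tensor from proto: dtype: DT_VARIANT")

// runFunc is a hypothetical stand-in for the real inference call
// (for example a wrapper around a session run that returns its outputs).
type runFunc func() ([]interface{}, error)

// outputAt runs the forward pass and returns the i-th output.
// It reports the run error or a short output slice as an error,
// so the caller never indexes into an empty result.
func outputAt(run runFunc, i int) (interface{}, error) {
	outputs, err := run()
	if err != nil {
		return nil, fmt.Errorf("forward pass failed: %w", err)
	}
	if i < 0 || i >= len(outputs) {
		return nil, fmt.Errorf("want output %d, got only %d outputs", i, len(outputs))
	}
	return outputs[i], nil
}

func main() {
	// Simulate the failing forward pass from the log.
	failing := func() ([]interface{}, error) { return nil, errVariant }
	if _, err := outputAt(failing, 2); err != nil {
		fmt.Println("error:", err)
	}
}
```

With a check like this the test would fail with the DT_VARIANT error itself, which is the part that actually needs fixing.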
When I loaded the saved model files in Python using tf.saved_model.load(), I had no problem: I could load the model, run a forward pass, and get exactly the same predictions from the checkpoint and from the saved model files. So this happens only when I load the model in Golang.
What's weird is that when I loaded the saved model files provided in the object detection model zoo (pre-trained models, done by people inside the TF team I guess), I was able to run forward passes in Golang, so I think there must be something wrong with the conversion script.
import os

import numpy as np
import tensorflow as tf
from google.protobuf import text_format
from PIL import Image, ImageDraw, ImageFont
from six import BytesIO

from object_detection.builders import model_builder
from object_detection.exporter_lib_v2 import export_inference_graph
from object_detection.protos import pipeline_pb2
from object_detection.utils import config_util
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

# Placeholder: path to the pipeline config file in the downloaded model folder.
pipeline_config_path = "path/to/pipeline.config"
# Placeholder: path to the checkpoint in the downloaded model.
model_dir = "path/to/checkpoint"
# Placeholder: directory in which to save the exported saved model files.
saved_model_path = "path/to/exported_model"

# Parse the text-format pipeline config into a TrainEvalPipelineConfig proto.
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.io.gfile.GFile(pipeline_config_path, 'r') as f:
    text_format.Merge(f.read(), pipeline_config)

# Export the checkpoint as a saved model that takes an image tensor as input.
export_inference_graph("image_tensor", pipeline_config, model_dir, saved_model_path)
Running the above code generates a folder called saved_model in the saved_model_path directory. You need to load the model in Golang and run a forward pass. Here is the code to do that.
I expect to be able to load the saved model files and perform forward passes. I have done this before with TF Object Detection V1; the issue appeared when I started using TF Object Detection V2 to fine-tune a pre-trained model on my dataset.
6. System information
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
TensorFlow installed from (source or binary): binary
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/official/...
3. Steps to reproduce
step 1: Download a pre-trained model (any model) from the object detection model zoo, https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md, then convert its checkpoint to the saved model format using the Python export code included in this report (the snippet that calls export_inference_graph).
step 2: Running the export code generates a folder called saved_model in the saved_model_path directory. Load the model from that folder in Golang and run a forward pass.