Unable to load saved_model.pb from a custom-trained TF2 model zoo model: Invalid GraphDef

funkysandman commented 3 years ago

I have trained an object detection model from the TF2 model zoo and exported it from a checkpoint to a SavedModel, which leaves me with a saved_model.pb file. When I try to load this model using the object detection sample code, graph.Import(pbFile) fails with: Tensorflow.InvalidArgumentError: 'Invalid GraphDef'

TensorFlow.NET version: 0.30.0

Zoo model: EfficientDet-d0

mdable2 commented 3 years ago

I am also having the same issue. Here are the details around my problem and what I have tried:

I have trained a model in Python using TensorFlow 2.3 and the TensorFlow Object Detection API. I saved the model in the SavedModel format, which gives me a .pb file. From there I am trying to use that graph in TensorFlow.NET, mimicking one of the TensorFlow.NET examples (DetectInMobilenet.cs).

The example starts with Run():

        public bool Run()
        {
            tf.compat.v1.disable_eager_execution();

            Predict();

            return true;
        }

        public Graph ImportGraph()
        {
            var graph = new Graph().as_default();
            graph.Import(Path.Join(modelDir, pbFile));

            return graph;
        }

        public void Predict()
        {
            // read in the input image
            var imgArr = ReadTensorFromImageFile(Path.Join(imageDir, picFile));

            var graph = ImportGraph();

            using (var sess = tf.Session(graph))
            {
                Tensor tensorNum = graph.OperationByName("num_detections");
                Tensor tensorBoxes = graph.OperationByName("detection_boxes");
                Tensor tensorScores = graph.OperationByName("detection_scores");
                Tensor tensorClasses = graph.OperationByName("detection_classes");
                Tensor imgTensor = graph.OperationByName("image_tensor");
                var outTensorArr = new Tensor[] { tensorNum, tensorBoxes, tensorScores, tensorClasses };

                var results = sess.run(outTensorArr, new FeedItem(imgTensor, imgArr));

                BuildOutputImage(results);
            }
        }

This works fine, so I then tried to do the same in my own solution.

Here is what I have tried so far:

tf.compat.v1.disable_eager_execution();
var graph = new Graph().as_default();
graph.Import(modelPath);

This gives the error Tensorflow.InvalidArgumentError: 'Invalid GraphDef'. After some searching, it looks like graph.Import() expects a frozen graph. But since I am using TF 2.3, the export produces the SavedModel format rather than a frozen graph.

Then I tried:

tf_with(Session.LoadFromSavedModel(modelPath), sess =>
            {
                Tensor tensorNum = sess.graph.OperationByName("num_detections");
                Tensor tensorBoxes = sess.graph.OperationByName("detection_boxes");
                Tensor tensorScores = sess.graph.OperationByName("detection_scores");
                Tensor tensorClasses = sess.graph.OperationByName("detection_classes");
                Tensor imgTensor = sess.graph.OperationByName("image_tensor");
                var outTensorArr = new Tensor[] { tensorNum, tensorBoxes, tensorScores, tensorClasses };

                var results = sess.run(outTensorArr, new FeedItem(imgTensor, image));
                return results;
            });

This loads the model fine, but the line Tensor tensorNum = sess.graph.OperationByName("num_detections"); then throws an error saying that "num_detections" is not found in the graph.

Thanks!
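
A note on the missing name: in a SavedModel exported from TF2, names such as num_detections are typically keys of the serving signature rather than operation names, which would explain why OperationByName cannot find them. The sketch below is one possible way to list the graph tensor names behind each key from Python. It assumes TensorFlow is installed, that the first meta graph is the serving one, and that the signature is called serving_default. load_meta_graph and signature_tensor_names are illustrative helper names, not part of any library.

```python
def load_meta_graph(saved_model_pb_path):
    """Parse saved_model.pb as a SavedModel message and return its first meta graph."""
    # Imported lazily so the helper below stays usable without TensorFlow.
    from tensorflow.core.protobuf import saved_model_pb2

    saved_model = saved_model_pb2.SavedModel()
    with open(saved_model_pb_path, "rb") as f:
        saved_model.ParseFromString(f.read())
    # Assumption: the first meta graph is the one tagged for serving.
    return saved_model.meta_graphs[0]


def signature_tensor_names(meta_graph, signature_key="serving_default"):
    """Map each signature key to the name of the graph tensor behind it."""
    signature = meta_graph.signature_def[signature_key]
    inputs = {key: info.name for key, info in signature.inputs.items()}
    outputs = {key: info.name for key, info in signature.outputs.items()}
    return inputs, outputs
```

Calling signature_tensor_names(load_meta_graph("saved_model.pb")) returns two dictionaries. Their values are the tensor names that a session-based loader would probably need in place of "num_detections" and the other keys.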

xsoheilalizadeh commented 3 years ago

I have the same issue when I load a trained model through ML.NET, which uses TensorFlow.NET 20.1.

The returned status is TF_INVALID_ARGUMENT.

https://github.com/SciSharp/TensorFlow.NET/blob/ba490102c148ed20c3622d698aecc62f3cb005e7/src/TensorFlowNET.Core/Graphs/Graph.Import.cs#L58

An unhandled exception of type 'System.FormatException' occurred in Microsoft.ML.TensorFlow.dll: 'Tensorflow exception triggered while loading model.'
 Inner exceptions found, see $exception in variables window for more details.
 Innermost exception     Tensorflow.InvalidArgumentError : Invalid GraphDef
   at Tensorflow.Status.Check(Boolean throwException)
   at Tensorflow.Graph.Import(Byte[] bytes, String prefix)
   at Tensorflow.Graph.Import(String file_path, String prefix)
   at Microsoft.ML.TensorFlow.TensorFlowUtils.LoadTFSessionByModelFilePath(IExceptionContext ectx, String modelFile, Boolean metaGraph)

Oceania2018 commented 3 years ago

@xsoheilalizadeh Could you open a PR with a runnable unit test that includes the model file, so we can investigate the issue?

xsoheilalizadeh commented 3 years ago

https://github.com/xsoheilalizadeh/TensorflowIssue @Oceania2018

xsoheilalizadeh commented 3 years ago

@Oceania2018 Do you have any updates?

Oceania2018 commented 3 years ago

@xsoheilalizadeh Are you sure it's a valid .pb file? I can't read it even in Python:

import tensorflow as tf
from tensorflow.python.platform import gfile
GRAPH_PB_PATH = './data/saved_model.pb'
with tf.compat.v1.Session() as sess:
   print("load graph")
   with tf.io.gfile.GFile(GRAPH_PB_PATH,'rb') as f:
       graph_def = tf.compat.v1.GraphDef()
       graph_def.ParseFromString(f.read())
   sess.graph.as_default()
   tf.import_graph_def(graph_def, name='')
   graph_nodes=[n for n in graph_def.node]
   names = []
   for t in graph_nodes:
      names.append(t.name)
   print(names)

xsoheilalizadeh commented 3 years ago

It was trained with TensorFlow 2.x, and I have already loaded it with TensorFlow.js.
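
A file can be a valid protobuf and still fail GraphDef parsing, because a saved_model.pb written by TF 2.x holds a SavedModel message rather than a bare GraphDef. As a rough, standard-library-only check, the sketch below looks at the first protobuf tag byte: in the SavedModel schema field 1 is an integer schema version, while in the GraphDef schema field 1 is the list of nodes. This is only a heuristic that assumes fields were written in field-number order; classify_pb_bytes and classify_pb_file are hypothetical helper names.

```python
def classify_pb_bytes(head):
    """Guess the top-level message type from the first protobuf tag byte."""
    if not head:
        return "unknown"
    field_number = head[0] >> 3   # upper bits of the tag hold the field number
    wire_type = head[0] & 0x07    # lower three bits hold the wire type
    if field_number == 1 and wire_type == 0:
        # SavedModel: field 1 is saved_model_schema_version (a varint).
        return "saved_model"
    if field_number == 1 and wire_type == 2:
        # GraphDef: field 1 is the repeated node list (length-delimited).
        return "graph_def"
    return "unknown"


def classify_pb_file(path):
    """Apply the same guess to the first byte of a file on disk."""
    with open(path, "rb") as f:
        return classify_pb_bytes(f.read(1))
```

If classify_pb_file reports "saved_model", a loader that expects a frozen GraphDef is likely to reject the file even though it is not corrupt.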

ButterMeWaffle commented 3 years ago

Is there any update on this issue? I am experiencing it too, and even after managing to create a frozen graph I cannot get the model to load.

Exitare commented 3 years ago

Having the same issue too.

Tensorflow.InvalidArgumentError: Invalid GraphDef

bbhxwl commented 3 years ago

me too

captainst commented 3 years ago

The .pb file created by the Object Detection API 2 is in the SavedModel format, which enables eager mode. I don't think you can load it using the conventional method. For reference: https://github.com/opencv/opencv/issues/19257, https://github.com/tensorflow/models/issues/8966
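
For anyone who wants to try producing a frozen GraphDef from a TF2 SavedModel anyway, the sketch below shows a commonly used recipe. It relies on convert_variables_to_constants_v2, which lives in TensorFlow's internal Python package and may change between versions. It assumes the model exposes a serving_default signature, and freeze_saved_model is an illustrative name. Whether the resulting file loads in TensorFlow.NET or OpenCV for an Object Detection API model is not confirmed in this thread.

```python
def freeze_saved_model(saved_model_dir, output_path, signature_key="serving_default"):
    """Write a frozen GraphDef for one signature; return its input and output tensor names."""
    # Imported lazily; requires a TensorFlow 2.x installation.
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    model = tf.saved_model.load(saved_model_dir)
    concrete_fn = model.signatures[signature_key]

    # Inline the variables as constants so the graph stands alone.
    frozen_fn = convert_variables_to_constants_v2(concrete_fn)
    graph_def = frozen_fn.graph.as_graph_def()

    with open(output_path, "wb") as f:
        f.write(graph_def.SerializeToString())

    input_names = [tensor.name for tensor in frozen_fn.inputs]
    output_names = [tensor.name for tensor in frozen_fn.outputs]
    return input_names, output_names
```

The returned names are the ones to feed and fetch when importing the frozen file, since they generally differ from the TF1-style names used in older examples.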

ButterMeWaffle commented 3 years ago

Are there any examples of using this SavedModel format to detect objects in images? The only examples I have found use a frozen graph with MobileNet or YOLO.

captainst commented 3 years ago

I have also been looking for one for some time. To the best of my knowledge, it is still an unsolved issue, not with TF.NET but with the Object Detection API 2 itself. If you happen to find a clue, feel free to share it here. Thank you.

bradqiu1982 commented 3 years ago

I have the same problem. At first I used TF2.x MobileNet: I could detect objects in images, but failed to export a frozen model that OpenCV can use.

Then I tried TF1.x MobileNet, but the custom-trained model failed to detect objects in images (this cost me two weeks of training, converting, loading, and testing).

Now I have come back to TF2.x object detection. Plan A: convert to ONNX format, which failed on load. Plan B: try TensorFlow.NET. The funny thing is that this library doesn't even have a function to load a SavedModel.

I may need to give up on TensorFlow and turn to PyTorch to find some luck.

bradqiu1982 commented 3 years ago

A workaround for using a TensorFlow object detection model in C#.

First, use Flask and Waitress to build a Python web service that loads and runs the TensorFlow object detection model: "

import json
import tensorflow as tf  # needed for tf.saved_model.load below
from flask import *

cache = dict()

app = Flask(__name__)
def getOBJDectModel(imgtype):
    if imgtype not in cache:
        model = tf.saved_model.load('./tfrec/ObjectDetectModel/'+imgtype+'/saved_model')
        cache[imgtype] = model
        return model
    else:
        return cache[imgtype]

@app.route("/OBJDetect", methods=["POST"])
def OBJDetect():
       ........................
if __name__ == "__main__":
    from waitress import serve
    serve(app, host="0.0.0.0", port=5000)

"

Then call the Python web service over HTTP from C#. With this workaround, we can use the TensorFlow object detection service from C#.

"// C# CODE

//url is http://localhost:5000/OBJDetect
 private string PythonRESTFun(string url,string reqstr)
        {
            string webResponse = string.Empty;
            try
            {
                Uri uri = new Uri(url);
                WebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(uri);
                httpWebRequest.ContentType = "application/json";
                httpWebRequest.Method = "POST";
                using (StreamWriter streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
                {
                    streamWriter.Write(reqstr);
                }
                HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();
                if (httpWebResponse.StatusCode == HttpStatusCode.OK)
                {
                    using (StreamReader streamReader = new StreamReader(httpWebResponse.GetResponseStream()))
                    {
                        webResponse = streamReader.ReadToEnd();
                    }
                }
            }
            catch (Exception ex)
            {
                // Errors are swallowed here; the method returns an empty string on failure.
            }

            return webResponse;
        }

"

Hope the code above saves you some time.

simonbuehler commented 3 years ago

@bradqiu1982 Nice idea. I always wanted to avoid the separation, but it seems to make life easier sometimes. Question: how did you push, e.g., the webcam image to the model? Did you control the webcam in the Python script, or push the image via the web request? It would be nice if you could answer this, and say whether you are satisfied with the performance.

bradqiu1982 commented 3 years ago

@simonbuehler, two ways: (1) encode your images as base64 strings, wrap them in JSON, and post the JSON object to the Python service; (2) save the images to a shared folder (one new unique-ID sub-folder per request) and tell the Python service the sub-folder name. I think the first solution should work for you.

For me, I am using the second solution, since I don't have a strict real-time requirement.
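
A minimal sketch of the first option, shown on the Python side using only the standard library. The JSON field names imgtype and image_b64 are made up for illustration and are not taken from the service above. The C# client would need to build the same JSON body.

```python
import base64
import json


def encode_image_request(image_bytes, image_type):
    """Build the JSON body a client would POST (field names are hypothetical)."""
    return json.dumps({
        "imgtype": image_type,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
    })


def decode_image_request(body):
    """Recover the image type and raw image bytes on the service side."""
    payload = json.loads(body)
    return payload["imgtype"], base64.b64decode(payload["image_b64"])
```

Base64 makes the payload roughly a third larger than the raw image, which is usually acceptable for single frames but worth keeping in mind for a video stream.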

bradqiu1982 commented 3 years ago

@simonbuehler Performance? When you start using Python and AI, please forget about performance.

With a traditional algorithm, I can process 30 images per second (CPU only). With AI, I only get 3 images per second (CPU only).

So why do I choose AI? Because it is so smart. With AI, you can write a powerful app in only 300 lines of code. To implement the same function with a traditional algorithm, you need 3000 lines of code.

For performance, I suggest you buy a recent NVIDIA video card and run the AI on it. Another suggestion is to use a small model such as mobilenet-v3.

I share my AI progress on LinkedIn: https://www.linkedin.com/in/brad-qiu-342437103

SciSharp / TensorFlow.NET, issue #650