Open rbgreenway opened 6 years ago
I have the same problem. Does anyone know which operations return the masks and classification results for mrcnn?
I was able to figure out the in/out layer names for the Keras Retinanet implementation using a tool that comes with the Tensorflow source called summarize_graph. This may be too much information, but this is basically the process:
1) Get the Tensorflow source from https://github.com/tensorflow/tensorflow
2) Install bazel for your system (this is the build tool needed to build summarize_graph). You may need to find instructions for installing bazel for your distro.
3) Navigate to the root of the tensorflow source directory, and then run in a terminal:
./configure
bazel build tensorflow/tools/graph_transforms:summarize_graph
4) summarize_graph is located at:
<path/to/TensorflowSource>/tensorflow/bazel-bin/tensorflow/tools/graph_transforms
Example run in a terminal:
/home/bryan/TFSource/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph="/home/bryan/retinanet/keras-retinanet/snapshots/test_bryan.pb"
If your .pb file is frozen and ready for inference, the output will tell you the names of the input and output layers. Here is the relevant part of the output from the command above for a Keras-Retinanet trained network:
Found 1 possible inputs: (name=input_1, type=float(1), shape=[?,?,?,3])
No variables spotted.
Found 3 possible outputs: (name=filtered_detections/map/TensorArrayStack/TensorArrayGatherV3, op=TensorArrayGatherV3) (name=filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3, op=TensorArrayGatherV3) (name=filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3, op=TensorArrayGatherV3)
From this, you can see that the input layer name is "input_1".
The output layer names are:
"filtered_detections/map/TensorArrayStack/TensorArrayGatherV3" <-- this is the boxes
"filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3" <-- this is the scores
"filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3" <-- this is the classes
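If you want to pull these names out programmatically rather than copy them by hand, a small parser over the summarize_graph text is enough. This is only a sketch: the function name parse_summarize_graph is made up for illustration, and it assumes the output keeps the "Found N possible inputs/outputs: (name=..., ...)" layout shown above.

```python
import re

# Matches the start of each "(name=<node name>, ...)" group in the output.
NAME_PATTERN = re.compile(r"\(name=([^,\s)]+)")


def parse_summarize_graph(text):
    """Return (input_names, output_names) found in summarize_graph output.

    Assumes the "Found N possible inputs:" / "Found N possible outputs:"
    line layout shown in the post above.
    """
    inputs, outputs = [], []
    for line in text.splitlines():
        line = line.strip()
        if re.match(r"Found \d+ possible inputs:", line):
            inputs = NAME_PATTERN.findall(line)
        elif re.match(r"Found \d+ possible outputs:", line):
            outputs = NAME_PATTERN.findall(line)
    return inputs, outputs


# The relevant part of the output quoted above.
SAMPLE = (
    "Found 1 possible inputs: (name=input_1, type=float(1), shape=[?,?,?,3])\n"
    "No variables spotted.\n"
    "Found 3 possible outputs: "
    "(name=filtered_detections/map/TensorArrayStack/TensorArrayGatherV3, op=TensorArrayGatherV3) "
    "(name=filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3, op=TensorArrayGatherV3) "
    "(name=filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3, op=TensorArrayGatherV3)\n"
)

if __name__ == "__main__":
    input_names, output_names = parse_summarize_graph(SAMPLE)
    print("inputs:", input_names)
    for name in output_names:
        print("output:", name)
```

The order of the outputs in the list follows the order in which summarize_graph printed them, so for this network it would be boxes, scores, classes.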
You can see the output layer names are quite complicated (I could have never guessed them).
I know you're working on MRCNN and not RetinaNet, but I hope this helps.
For those interested (judging by the lack of response to my original post, this may be no one), I'll try to put together a complete post of the process for taking the trained .h5 Keras file, converting it to a .pb file, and then using this .pb file with TensorflowSharp. There are lots of little nuances that I had to figure out in order to get it to work properly, but it was worth the effort for me.
Hello @rbgreenway, I have the same problem implementing a MaskRcnn frozen graph using TensorFlowSharp. Can you please post your approach to implementing this?
I've been using TensorflowSharp with Faster RCNN successfully for a while now; however, I recently trained a Retinanet model (using Keras/Python3.5), verified it works in python, and have created a frozen pb file for use with Tensorflow. For FRCNN, there is an example in the TensorflowSharp GitHub repo that shows how to run/fetch this model. For Retinanet, I tried modifying the code but nothing seems to work. I have a model summary for Retinanet that I've tried to work from, but it's not obvious to me what should be used. The problem appears to be the parameters for the "Fetch" portion of the Runner.
For FRCNN, the graph is run in this way:
From the model summary for FRCNN, it is obvious what the input ("image_tensor") and outputs ("detection_boxes", "detection_scores", "detection_classes", and "num_detections") are. They are not the same for Retinanet (I've tried), and I can't figure out what they should be. The "Fetch" part of the code above is causing a crash, and I'm guessing it's because I'm not getting the node names right.
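One way to test candidate node names outside of TensorflowSharp is to feed and fetch by name in Python, where graph.get_tensor_by_name raises a KeyError for a name that does not exist in the graph. The following is a minimal sketch, assuming TensorFlow is installed and the tf.compat.v1 API is available; tensor_name and run_frozen_graph are hypothetical helper names, and the caller supplies the path, the names, and the image batch.

```python
def tensor_name(op_name, index=0):
    """Turn an op name into a tensor name, e.g. "input_1" -> "input_1:0".

    Python fetches tensors ("op:index"), not ops, so the output index
    has to be appended to the node name.
    """
    return "{}:{}".format(op_name, index)


def run_frozen_graph(pb_path, input_op, output_ops, image_batch):
    """Run a frozen graph once, feeding one input and fetching by op name.

    Assumption: pb_path points to a frozen GraphDef and image_batch has
    the shape and dtype the input placeholder expects.
    """
    import tensorflow as tf  # lazy import; assumes TensorFlow is installed

    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    graph = tf.Graph()
    with graph.as_default():
        # name="" keeps the node names exactly as stored in the .pb file.
        tf.import_graph_def(graph_def, name="")

    # get_tensor_by_name raises KeyError if a name is not in the graph.
    feed = graph.get_tensor_by_name(tensor_name(input_op))
    fetches = [graph.get_tensor_by_name(tensor_name(op)) for op in output_ops]

    with tf.compat.v1.Session(graph=graph) as sess:
        return sess.run(fetches, feed_dict={feed: image_batch})
```

If a set of names works here, the same op names (with output index 0) are the ones to try in the input and fetch calls of the runner.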
I won't paste the entire Retinanet summary here, but here are the first few nodes:
And here are the last several nodes:
Any help with figuring out how to fix the "Fetch" part of this would be greatly appreciated.
EDIT:
To dig a little further into this, I found a python function to print the operation names from a .pb file. When doing this for the FRCNN .pb file, it clearly gave the output node names, as can be seen below (only posting the last several lines from the output of the python function).
If I do the same thing for the Retinanet .pb file, it's not obvious what the outputs are. Here are the last several lines of output from the python function.
For reference, here's the python function that I used:
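The original function is not shown here. A minimal sketch of a function along these lines, assuming TensorFlow is installed and the tf.compat.v1 GraphDef API is available, could look like the following. The names read_nodes, output_candidates, and print_ops are made up for illustration. The output_candidates helper is an extra: it lists nodes that no other node consumes, which is similar in spirit to the "possible outputs" that summarize_graph reports, and the set of op types it skips is an assumption.

```python
def read_nodes(pb_path):
    """Load a frozen GraphDef and return (name, op, inputs) for every node."""
    import tensorflow as tf  # lazy import; assumes TensorFlow is installed

    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return [(n.name, n.op, list(n.input)) for n in graph_def.node]


# Op types treated as unlikely outputs. This list is an assumption.
UNLIKELY_OUTPUT_OPS = {"Const", "Assert", "NoOp", "Placeholder"}


def output_candidates(nodes):
    """Return names of nodes that are not used as input by any other node."""
    consumed = set()
    for _name, _op, inputs in nodes:
        for ref in inputs:
            # "^name" marks a control dependency; "name:1" selects output 1.
            consumed.add(ref.lstrip("^").split(":")[0])
    return [
        name
        for name, op, _inputs in nodes
        if name not in consumed and op not in UNLIKELY_OUTPUT_OPS
    ]


def print_ops(pb_path):
    """Print every op type and node name, then the likely output nodes."""
    nodes = read_nodes(pb_path)
    for name, op, _inputs in nodes:
        print(op, name)
    print("Possible outputs:", output_candidates(nodes))
```

Calling print_ops on the path to a frozen .pb file prints one line per node, so for a large graph it is usually easier to look only at the "Possible outputs" line at the end.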
Hope this helps.
If I can get this working, I'll gladly share my code for training Retinanet in Keras (which is actually transfer learning on my custom objects) and for running the inference of that model in TensorflowSharp. In my python testing, Retinanet clearly outperforms FRCNN.