Closed: sungsulim closed this issue 6 years ago.
I tried with different combinations of --transforms options but I get a similar error.
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/home/slim/object_detection/data/rfcn_resnet101_coco_11_06_2017/frozen_inference_graph.pb \
--out_graph=/home/slim/quantization/coco_rfcn_transformed_graph.pb \
--inputs='image_tensor' \
--outputs='detection_boxes,detection_scores,detection_classes,num_detections' \
--transforms='
add_default_attributes
quantize_weights
quantize_nodes
strip_unused_nodes
sort_by_execution_order'
Traceback (most recent call last):
File "demo.py", line 135, in <module>
feed_dict={image_tensor: image_np_expanded})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 896, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1279, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1298, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: The node 'Preprocessor/map/while/ResizeToRange/ResizeBilinear_eightbit/Preprocessor/map/while/ResizeToRange/ExpandDims/reshape' has inputs from different frames. The input 'Preprocessor/map/while/ResizeToRange/ExpandDims' is in frame 'Preprocessor/map/while/Preprocessor/map/while/'. The input 'Preprocessor/map/while/ResizeToRange/mul_eightbit/Preprocessor/map/while/ResizeToRange/ToFloat/reshape_dims' is in frame ''.
@dreamdragon, it's not clear to me whether your models are intended to be easily quantized. Can you please comment on what's known about doing so?
Thanks for the reproduction case and good description on this one. We haven't tried quantization with this model, and I suspect that ResNet-style architectures may not tolerate quantization well from an accuracy standpoint (since they're so deep), even if we fix this immediate issue.
Can you give a bit more about your motivation for quantizing in this case? If it's to reduce file size, then quantize_weights may be enough.
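For the file-size case, a weights-only transform can also be applied straight from Python with the graph_transforms wrapper that ships with TF 1.x. A minimal sketch with placeholder file paths (not taken from this thread):

```python
# Sketch: weights-only quantization through the Python wrapper for the
# Graph Transform Tool (TF 1.x). File paths are placeholders.
import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

transformed = TransformGraph(
    graph_def,
    ['image_tensor'],                              # input node names
    ['detection_boxes', 'detection_scores',
     'detection_classes', 'num_detections'],       # output node names
    ['quantize_weights'])                          # shrink weights, keep float compute

with tf.gfile.GFile('frozen_inference_graph_weights_quantized.pb', 'wb') as f:
    f.write(transformed.SerializeToString())
```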
@petewarden Thank you for your response. My main motivation is not file size but faster inference. I've been comparing with SSD, and I wanted to keep the accuracy high while making it faster.
Then do you believe quantization might not work as well on other deep models like Inception_Resnet v2?
@petewarden I've recently been trying to quantize SSD_MobileNet v1. The source is the frozen graph from the Tensorflow Object Detection API model zoo. I'm using Tensorflow 1.3.1. The transform_graph command is almost exactly the same as the one at the top of this thread and is adapted from the "8-bit Calculations" section of the official doc.
bazel-bin/tensorflow/tools/graph_transforms/transform_graph --in_graph=ssd_mobilenet_v1_coco.pb.original --out_graph=ssd_mobilenet_v1_coco.pb.8bit --inputs='image_tensor' --outputs='detection_boxes,detection_scores,detection_classes,num_detections' --transforms=' add_default_attributes strip_unused_nodes(type=float) remove_nodes(op=CheckNumerics) fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes strip_unused_nodes sort_by_execution_order'
I tried the quantized model on iOS, Linux x86_64, and Raspberry Pi 3. They all failed with the same error:
InvalidArgumentError: The node 'Preprocessor/map/while/ResizeImage/ResizeBilinear/eightbit' has inputs from different frames. The input 'Preprocessor/map/while/ResizeImage/size' is in frame 'Preprocessor/map/while/Preprocessor/map/while/'. The input 'Preprocessor/map/while/ResizeImage/ResizeBilinear_eightbit/Preprocessor/map/while/ResizeImage/ExpandDims/quantize' is in frame ''.
The model runs fine without "quantize_nodes". But using 8-bit weights only is not my goal. I'd like to try 8-bit calculations.
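As a sanity check on what the transform actually produced, counting the quantized op types in the output .pb shows whether 8-bit calculation nodes are really present or whether only the weights were converted. A rough diagnostic sketch (the path matches the out_graph above; everything else is illustrative, not from the original post):

```python
# Diagnostic sketch: count quantized op types in a transformed .pb.
# quantize_nodes should add compute ops such as QuantizedConv2D and QuantizeV2;
# quantize_weights alone adds no such compute ops.
import collections
import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('ssd_mobilenet_v1_coco.pb.8bit', 'rb') as f:  # placeholder path
    graph_def.ParseFromString(f.read())

op_counts = collections.Counter(node.op for node in graph_def.node)
quantized_ops = {op: n for op, n in op_counts.items()
                 if op.startswith('Quantized') or op in ('QuantizeV2', 'Dequantize')}
print(quantized_ops)
```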
Should I move on from Tensorflow 1.3.1 to try something more recent?
Also, I've seen many examples of "transform_graph" online and they all look somewhat different, which is understandable because the command options are very model-specific. But what would be your recommendation for SSD_MobileNet v1?
Thanks!
@h8907283 have you addressed the problem?
@snownus Yes and no. I patched a bug in quantize_nodes.cc (from a fix made after the 1.4.0 release) and now the runtime stops complaining. However, inference is slower regardless of the platform:
iOS 11 / iPhone 7: 340ms/frame vs 140ms/frame
macOS 10.13: 250ms/frame vs 80ms/frame
Ubuntu 16.04 x86_64: 330ms/frame vs 100ms/frame
Raspberry Pi 3: 1.3s/frame vs 1.0s/frame
All platforms use an optimized TensorFlow runtime (Accelerate.framework on iOS, SSE4.x on x86, neon-fpv4 on Pi 3).
Also, they all give grossly inaccurate results. My model is SSD_MobileNet pretrained on COCO with no modification or retraining.
My expectation is that this 8-bit quantization effort has been going on for quite some time, so the issue I've been having is likely on my side. I'll keep digging. My next step is to try MobileNet without SSD.
I did another test: I rebuilt a frozen graph of mobilenet_v1 with ImageNet weights, using export_inference_graph from TF-slim and freeze_graph from TF 1.4.0. I ran label_image with this graph on the Grace Hopper image:
653:military uniform (653): 0.862189
458:bow tie, bow-tie, bowtie (458): 0.0605872
835:suit, suit of clothes (835): 0.0121595
723:ping-pong ball (723): 0.0107614
440:bearskin, busby, shako (440): 0.00682122
Well, no surprise.
I then used transform_graph to quantize just the weights (--transforms='fold_batch_norms fold_old_batch_norms quantize_weights').
653:military uniform (653): 0.763031
458:bow tie, bow-tie, bowtie (458): 0.0794292
835:suit, suit of clothes (835): 0.0416097
723:ping-pong ball (723): 0.0145141
753:racket, racquet (753): 0.01387
The score dropped a bit. But then, when I used transform_graph to also quantize the nodes (--transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,-1,-1,3") fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes strip_unused_nodes sort_by_execution_order'), the inference result fell apart:
723:ping-pong ball (723): 0.391782
653:military uniform (653): 0.092184
835:suit, suit of clothes (835): 0.0860384
907:Windsor tie (907): 0.07682
918:comic book (918): 0.0353372
@h8907283 How did you patch the bug after the 1.4 release? Currently I still get the runtime error about inputs from different frames.
https://github.com/tensorflow/tensorflow/commit/17ce98437f34ab5439b3e46adb2eb5b692c48abd
I used the 1.4.0 release and applied the change in the above commit.
So you mean you used the r1.4 branch, changed only what is in the above commit, and everything works, right?
That means you built from source on the r1.4 branch with the change from the above commit, installed it, and it works?
I updated the TensorFlow source code on the r1.4 branch with the above commit, and installed pip from the official site. When I run Faster R-CNN it still raises the error: ValueError: graph_def is invalid at node u'Decode/get_center_coordinates_and_sizes/add_1_eightbit/Decode/get_center_coordinates_and_sizes/unstackport1/reshape_dims': Control input '^Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def.
@sungsulim have you addressed the get_center_coordinates_and_sizes/unstackport1 issue?
@snownus I built everything from source, the Python bindings and the tools. I only tried SSD MobileNet; I didn't try Faster R-CNN. Maybe the fix didn't fix everything?
@h8907283 I see. Thanks very much for your help.
I have a similar issue when quantizing using 'transform_graph'. I used the tool on AlexNet trained on ImageNet. When quantizing only the weights the accuracy is great; when quantizing both weights and nodes, I get a "graph_def is invalid" error:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 316, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 406, in import_graph_def
% (input_name,)))
ValueError: graph_def is invalid at node u'Conv2D_7_eightbit/split_4__port__1/reshape_dims': Control input '^split_4:1' not found in graph_def..
I'm using tensorflow-1.4.0-cp27-cp27mu. Has anyone solved/faced this issue?
Thanks
@nmoezzi Make sure the TensorFlow version is consistent on your machine.
@snownus I used a bazel build (from source) to create the quantized graph, and Python to run the TF inference example. I think the version is consistent. Should I run the Python example from the source build as well?
This time, instead of "transform_graph" (for weight and node quantization), I used bazel-bin/tensorflow/tools/quantization/quantize_graph and it worked: I see about a 2% accuracy loss for AlexNet, although some posts have mentioned that "quantize_graph" is obsolete.
@snownus @sungsulim I met the same issue. Did you address the "control input %r not found in graph_def" error? I updated the TensorFlow source code on the r1.4 branch with the above commit, and installed pip from the official site. When I run Faster R-CNN it still raises the error: ValueError: graph_def is invalid at node u'Decode/get_center_coordinates_and_sizes/add_1_eightbit/Decode/get_center_coordinates_and_sizes/unstackport1/reshape_dims': Control input '^Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def.
This time I tried quantizing the graph based on the examples at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/#eight-bit-calculations
First I quantized inception_v3_2016_08_28_frozen.pb using the following:
bazel-bin/tensorflow/tools/graph_transforms/transform_graph --in_graph=tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen.pb --out_graph=tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen_quantized.pb --inputs="input" --outputs='InceptionV3/Predictions/Reshape_1' --transforms=' add_default_attributes strip_unused_nodes(type=float, shape="1,299,299,3") remove_nodes(op=Identity, op=CheckNumerics) fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes strip_unused_nodes sort_by_execution_order' --output_as_text=false
and then tried to use the quantized graph for image classification:
bazel-bin/tensorflow/examples/label_image/label_image --image=tensorflow/examples/label_image/data/grace_hopper.jpg --input_layer="input" --output_layer='InceptionV3/Predictions/Reshape_1' --graph=/tmp/logged_quantized_inception.pb --labels=tensorflow/examples/label_image/data/imagenet_slim_labels.txt
I get this error:
2017-11-29 16:41:21.942306: E tensorflow/examples/label_image/main.cc:327] Invalid argument: Node 'InceptionV3/InceptionV3/Conv2d_1a_3x3/BatchNorm/batchnorm/mul_eightbit/input__port__0/reduction_dims': Unknown input node '^input:0'
If I remove 'quantize_nodes' from the --transforms options, the test passes.
This seems to be a bug in TensorFlow's 'quantize_nodes'. TensorFlow version: 1.4.0-cp27-cp27mu
@snownus thanks for your answer. I noticed the same problem: when I remove 'quantize_nodes', it works well on Linux (GPU) and the size of the .pb shrinks to 1/4, but it does not speed up inference. And when I moved the quantized .pb to Windows (without a GPU), the .pb did not work and raised an error. Someone said that Windows does not support quantization; now I'm stuck.
@snownus the model I used was 'faster_rcnn_inception_resnet_v2_atrous_coco' from the Object Detection API, but I retrained it with my own data. Another error is that I cannot add 'fold_constants'; it raises another error when importing the graph for inference, like:
totalMemory: 10.91GiB freeMemory: 10.64GiB
2017-11-24 13:22:10.619632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
1 Fri Nov 24 13:22:12 2017
2017-11-24 13:22:19.280005: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./detectionImage.py", line 66, in
Caused by op 'Preprocessor/map/TensorArray', defined at:
File "./detectionImage.py", line 26, in
InvalidArgumentError (see above for traceback): NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=
@snownus the error when I added 'quantize_nodes' and imported the quantized .pb was:
/home/emg/anaconda3/bin/python ./detectionImage.py
Traceback (most recent call last):
File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 364, in import_graph_def
source_op = name_to_op[input_name[1:]]
KeyError: 'Decode/get_center_coordinates_and_sizes/unstack:1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./detectionImage.py", line 26, in
This is different from yours; is it because of a different model?
@sanbeng Can you check the TensorFlow version? When I updated to TensorFlow 1.4, I no longer had that error.
@sanbeng, you can check the issue and solution here: https://github.com/tensorflow/tensorflow/pull/9792#issuecomment-344129365
@snownus I have updated to TensorFlow 1.4, but the error is still there.
@sanbeng you need to update the code following https://github.com/wodesuck/tensorflow/commit/6c1ab6d34213057f5d70d194094ff48137815ae3
@sanbeng ^Decode cannot be recognized. Please follow wodesuck/tensorflow@6c1ab6d
@sanbeng but I still have other issues:
Invalid argument: input_max_range must be larger than input_min_range. [[Node: SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/quantize = QuantizeV2[T=DT_QUINT8, mode="MIN_FIRST", _device="/job:localhost/replica:0/task:0/device:CPU:0"](SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/_817, SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/min/_819, SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/max/_821)]] 2017-12-06 11:06:30.110756: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: input_max_range must be larger than input_min_range.
@snownus OK, thanks very much, I will try it. But why can't the quantized model run on Windows?
@snownus I quantized the model on Ubuntu using a GPU; I want to run the quantized model on Windows with a CPU.
@sanbeng I am not sure whether the operating system affects the quantized model.
@snownus I did as you said, but there is a new error when I run the quantized model:
emg@emg-200:~/tf$ sudo /home/emg/anaconda3/bin/python ./detectionImage.py
Traceback (most recent call last):
File "./detectionImage.py", line 24, in
@sanbeng, I haven't come across that error. Even now, I cannot run it because of internal TensorFlow bugs.
It works now. You can use the TensorFlow quantization lib: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python (TF version 1.8).
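For reference, a rough sketch of how that contrib/quantize rewriter is typically used; the tiny build_model network below is only a stand-in, not one of the detection models from this thread, and the shapes are illustrative:

```python
# Sketch of quantization-aware training with tf.contrib.quantize (TF 1.8+).
import tensorflow as tf

def build_model(images, is_training):
    # Tiny stand-in network; in practice this would be your real model fn.
    net = tf.layers.conv2d(images, 8, 3, padding='same', activation=tf.nn.relu)
    net = tf.layers.batch_normalization(net, training=is_training)
    return tf.layers.conv2d(net, 2, 1)

# 1) Training graph: rewrite with fake-quantization ops so min/max ranges are
#    learned during training, then train as usual and save a checkpoint.
train_graph = tf.Graph()
with train_graph.as_default():
    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    labels = tf.placeholder(tf.float32, [1, 224, 224, 2])
    loss = tf.losses.mean_squared_error(labels, build_model(images, is_training=True))
    tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=0)
    train_op = tf.train.GradientDescentOptimizer(1e-3).minimize(loss)

# 2) Eval graph: rebuild the model, apply the eval rewrite, restore the trained
#    checkpoint, then freeze/convert the result for deployment.
eval_graph = tf.Graph()
with eval_graph.as_default():
    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    build_model(images, is_training=False)
    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
```

This is a different route from running transform_graph with quantize_nodes on an already-frozen graph: the quantization ranges are learned during training instead of being inserted after the fact.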
I still face the same quantization issue (with quantize_nodes as a transform) mentioned above, using ResNet with TF 1.8. Has this been resolved, or is it planned to be resolved?
input_max_range must be larger than input_min_range. [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/quantize = QuantizeV2[T=DT_QUINT8, mode="MIN_FIRST", round_mode="HALF_AWAY_FROM_ZERO", _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/min, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/max)]]
Please try this updated set of instructions, which is specifically designed for MobileNet SSD and can be customized to your needs: https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193
Is this still an open issue?
@achowdhery I have addressed the issue. You can use the TensorFlow quantize lib: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python
@mcfair, you can refer to the link I sent: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python
I use the interface of the latest TensorFlow quantization lib. It works now.
@snownus Thanks. I will close the issue. Please open a new bug if you have other questions.
System information
Describe the problem
I'm trying to quantize the rfcn_resnet101_coco model given in the Tensorflow model zoo (https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md). I can quantize the model using 'transform_graph', but I get an error when trying to run inference.
Source code / logs
Here is the command that I use to do the quantization.
The following is the code that I use to run inference. It's basically the same example given in the Jupyter notebook (https://github.com/tensorflow/models/blob/master/object_detection/object_detection_tutorial.ipynb).
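Roughly, it boils down to the following sketch (condensed from the tutorial notebook; the graph and image paths are placeholders rather than the exact files used here):

```python
# Condensed sketch of the inference loop from the object_detection tutorial
# notebook; paths are placeholders.
import numpy as np
import tensorflow as tf
from PIL import Image

PATH_TO_GRAPH = 'coco_rfcn_transformed_graph.pb'  # placeholder
PATH_TO_IMAGE = 'image1.jpg'                      # placeholder

detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_GRAPH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=detection_graph) as sess:
    image_np = np.array(Image.open(PATH_TO_IMAGE))
    image_np_expanded = np.expand_dims(image_np, axis=0)
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    output_tensors = [detection_graph.get_tensor_by_name(name + ':0')
                      for name in ('detection_boxes', 'detection_scores',
                                   'detection_classes', 'num_detections')]
    boxes, scores, classes, num = sess.run(
        output_tensors, feed_dict={image_tensor: image_np_expanded})
```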
Below is the error message:
I tried transform_graph excluding the 'quantize_nodes' option and that works fine. I think it has to do with 'quantize_nodes' and the 'while' loops in the model, but I'm not sure how to fix it.
I tried the suggestions in https://github.com/tensorflow/tensorflow/issues/7162 and https://github.com/tensorflow/tensorflow/pull/9792, but then I also run into another error in tf.import_graph_def.
Any help is appreciated! Thanks in advance.