tensorflow / models

Models and examples built with TensorFlow

Error quantizing nodes (transform_graph) in rfcn_resnet101_coco #1879

Closed sungsulim closed 6 years ago

sungsulim commented 7 years ago

System information

Describe the problem

I'm trying to quantize the rfcn_resnet101_coco model from the TensorFlow model zoo (https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md). I can quantize the model using 'transform_graph', but I get an error when trying to run inference.

Source code / logs

Here is the command that I use to do the quantization.

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/home/slim/object_detection/data/rfcn_resnet101_coco_11_06_2017/frozen_inference_graph.pb \
--out_graph=/home/slim/quantization/coco_rfcn_transformed_graph.pb \
--inputs='image_tensor' \
--outputs='detection_boxes,detection_scores,detection_classes,num_detections' \
--transforms='
  add_default_attributes
  strip_unused_nodes(type=float)
  remove_nodes(op=CheckNumerics)
  fold_constants(ignore_errors=true)
  fold_batch_norms
  fold_old_batch_norms
  quantize_weights
  quantize_nodes
  strip_unused_nodes
  sort_by_execution_order'

The following is the code that I use to run inference. It's basically the same as the example given in the Jupyter notebook (https://github.com/tensorflow/models/blob/master/object_detection/object_detection_tutorial.ipynb).

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

sys.path.append("..")
sys.path.append("../../")
from utils import label_map_util
from utils import visualization_utils as vis_util

#### FLAGS ###
tf.app.flags.DEFINE_string('input_dir', None, 'Directory where demo images are stored')
tf.app.flags.DEFINE_string('output_dir', None, 'Directory where output images are to be stored')
tf.app.flags.DEFINE_string('ckpt_path', None, 'Path where ckpt is stored')
tf.app.flags.DEFINE_string('label_path', None, 'Path to .pbtxt where correct label mapping is stored')
tf.app.flags.DEFINE_integer('num_classes', None, 'Number of classes')

FLAGS = tf.app.flags.FLAGS

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(FLAGS.ckpt_path, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(FLAGS.label_path)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=FLAGS.num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

image_names = sorted(os.listdir(FLAGS.input_dir))

# Make output directory
try:
  os.makedirs(FLAGS.output_dir)
except OSError:
  if not os.path.isdir(FLAGS.output_dir):
    raise

config = tf.ConfigProto()

with detection_graph.as_default():
  with tf.Session(config=config, graph=detection_graph) as sess:

    for name in image_names:
      image_path = FLAGS.input_dir+'/'+name
      save_image_path = FLAGS.output_dir+'/output_'+name

      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)

      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represents the level of confidence for each of the objects.
      # Score is shown on the result image, together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')

      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})

      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      im = Image.fromarray(image_np)
      im.save(save_image_path)

Below is the error message:

Traceback (most recent call last):
  File "demo.py", line 135, in <module>
    feed_dict={image_tensor: image_np_expanded})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 896, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1279, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1298, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: The node 'map/while/ToAbsoluteCoordinates/Scale/mul_2_eightbit/map/while/ToAbsoluteCoordinates/Scale/split__port__1/reshape' has inputs from different frames. The input 'map/while/ToAbsoluteCoordinates/Scale/split' is in frame 'map/while/map/while/'. The input 'SecondStagePostprocessor/Reshape_2_eightbit/SecondStagePostprocessor/Tile/reshape_dims' is in frame ''.

I tried transform_graph without the 'quantize_nodes' option and that works fine. I think the problem has to do with 'quantize_nodes' and the 'while' loops in the model, but I'm not sure how to fix it.

I tried the suggestions in https://github.com/tensorflow/tensorflow/issues/7162 and https://github.com/tensorflow/tensorflow/pull/9792, but then I run into another error in tf.import_graph_def:

Traceback (most recent call last):
  File "demo.py", line 64, in <module>
    tf.import_graph_def(od_graph_def, name='')
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 369, in import_graph_def
    'Control input %r not found in graph_def.' % (input_name,)))
ValueError: graph_def is invalid at node u'SecondStagePostprocessor/Decode/get_center_coordinates_and_sizes/add_1_eightbit/SecondStagePostprocessor/Decode/get_center_coordinates_and_sizes/unstack__port__1/reduction_dims': Control input '^SecondStagePostprocessor/Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def..
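
For reference, here is a minimal diagnostic sketch (GRAPH_PATH is assumed to be the transformed graph produced by the transform_graph command above; adjust it to your own path) that scans the rewritten GraphDef for control inputs that still carry an output port such as ':1', which is exactly what this import error complains about:

import tensorflow as tf

# Path to the transformed graph produced by transform_graph above; adjust as needed.
GRAPH_PATH = '/home/slim/quantization/coco_rfcn_transformed_graph.pb'

graph_def = tf.GraphDef()
with tf.gfile.GFile(GRAPH_PATH, 'rb') as f:
    graph_def.ParseFromString(f.read())

node_names = set(node.name for node in graph_def.node)

for node in graph_def.node:
    for inp in node.input:
        if inp.startswith('^'):
            target = inp[1:]
            # A control input must name a whole node; a ':1' port suffix is invalid
            # and will not match any node name, so import_graph_def rejects it.
            if ':' in target or target not in node_names:
                print('%s has a suspicious control input: %s' % (node.name, inp))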

Any help is appreciated! Thanks in advance.

sungsulim commented 7 years ago

I tried different combinations of the --transforms options, but I get a similar error.

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/home/slim/object_detection/data/rfcn_resnet101_coco_11_06_2017/frozen_inference_graph.pb \
--out_graph=/home/slim/quantization/coco_rfcn_transformed_graph.pb \
--inputs='image_tensor' \
--outputs='detection_boxes,detection_scores,detection_classes,num_detections' \
--transforms='
  add_default_attributes
  quantize_weights
  quantize_nodes
  strip_unused_nodes
  sort_by_execution_order'
Traceback (most recent call last):
  File "demo.py", line 135, in <module>
    feed_dict={image_tensor: image_np_expanded})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 896, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1279, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1298, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: The node 'Preprocessor/map/while/ResizeToRange/ResizeBilinear_eightbit/Preprocessor/map/while/ResizeToRange/ExpandDims/reshape' has inputs from different frames. The input 'Preprocessor/map/while/ResizeToRange/ExpandDims' is in frame 'Preprocessor/map/while/Preprocessor/map/while/'. The input 'Preprocessor/map/while/ResizeToRange/mul_eightbit/Preprocessor/map/while/ResizeToRange/ToFloat/reshape_dims' is in frame ''.

cy89 commented 7 years ago

@dreamdragon, it's not clear to me whether your models are intended to be easily quantized. Can you please comment on what's known about doing so?

petewarden commented 7 years ago

Thanks for the reproduction case and good description on this one. We haven't tried quantization with this model, and I suspect that ResNet-style architectures may not tolerate quantization well from an accuracy standpoint (since they're so deep), even if we fix this immediate issue.

Can you give a bit more about your motivation for quantizing in this case? If it's to reduce file size, then quantize_weights may be enough.

sungsulim commented 7 years ago

@petewarden Thank you for your response. My main motivation is not file size but faster inference. I've been comparing with SSD, and I wanted to keep the accuracy high while making the model faster.

Then do you believe quantization might not work as well on other deep models like Inception_Resnet v2?

h8907283 commented 7 years ago

@petewarden I've recently been trying to quantize SSD_MobileNet v1. The source is the frozen graph from the TensorFlow Object Detection API model zoo. I'm using TensorFlow 1.3.1. The transform_graph command is almost exactly the same as the one at the top of this thread and is adapted from the "8-bit Calculations" section of the official docs.

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=ssd_mobilenet_v1_coco.pb.original \
--out_graph=ssd_mobilenet_v1_coco.pb.8bit \
--inputs='image_tensor' \
--outputs='detection_boxes,detection_scores,detection_classes,num_detections' \
--transforms='
  add_default_attributes
  strip_unused_nodes(type=float)
  remove_nodes(op=CheckNumerics)
  fold_constants(ignore_errors=true)
  fold_batch_norms
  fold_old_batch_norms
  quantize_weights
  quantize_nodes
  strip_unused_nodes
  sort_by_execution_order'

I tried the quantized model on iOS, Linux x86_64, Raspberry Pi 3. They all failed with the same error:

InvalidArgumentError: The node 'Preprocessor/map/while/ResizeImage/ResizeBilinear/eightbit' has inputs from different frames. The input 'Preprocessor/map/while/ResizeImage/size' is in frame 'Preprocessor/map/while/Preprocessor/map/while/'. The input 'Preprocessor/map/while/ResizeImage/ResizeBilinear_eightbit/Preprocessor/map/while/ResizeImage/ExpandDims/quantize' is in frame ''.

The model runs fine without "quantize_nodes". But using 8-bit weights only is not my goal. I'd like to try 8-bit calculations.

Should I move on from Tensorflow 1.3.1 to try something more recent?

Also, I've seen many examples of "transform_graph" online and they all look a bit different, which is understandable because the command options are very model-specific. But what would be your recommendation for SSD_MobileNet v1?

Thanks!

snownus commented 6 years ago

@h8907283 Have you addressed the problem?

h8907283 commented 6 years ago

@snownus Yes and no. I patched a bug in quantize_nodes.cc (from a fix made after the 1.4.0 release) and now the runtime stops complaining. However, inference is slower regardless of platform:

iOS 11 / iPhone 7: 340ms/frame vs 140ms/frame
macOS 10.13: 250ms/frame vs 80ms/frame
Ubuntu 16.04 x86_64: 330ms/frame vs 100ms/frame
Raspberry Pi 3: 1.3s/frame vs 1.0s/frame

All platforms use an optimized TensorFlow runtime (Accelerate.framework on iOS, SSE4.x on x86, neon-vfpv4 on the Pi 3).
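
For context, a minimal sketch of how per-frame numbers like those above can be measured; it assumes a loaded detection_graph and the same tensor names as the script earlier in this thread, and the helper name, warm-up pass, and frame count are arbitrary choices for illustration, not an exact description of the measurement setup:

import time

import tensorflow as tf

def rough_benchmark(detection_graph, image_np_expanded, num_frames=50):
    """Rough per-frame latency for a frozen detection graph (illustration only)."""
    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            outputs = [detection_graph.get_tensor_by_name(name + ':0')
                       for name in ('detection_boxes', 'detection_scores',
                                    'detection_classes', 'num_detections')]
            # One warm-up run so one-time setup cost is not counted.
            sess.run(outputs, feed_dict={image_tensor: image_np_expanded})
            start = time.time()
            for _ in range(num_frames):
                sess.run(outputs, feed_dict={image_tensor: image_np_expanded})
            elapsed_ms = 1000.0 * (time.time() - start) / num_frames
            print('%.0f ms/frame' % elapsed_ms)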

Also, they all give grossly inaccurate results. My model is SSD_MobileNet pretrained on COCO with no modification or retraining.

My expectation is that this 8-bit quantization effort has been going on for quite some time, so the issue I've been having is likely on my side. I'll keep digging. My next step is to try MobileNet without SSD.

h8907283 commented 6 years ago

I just did another test: I rebuilt a frozen graph of mobilenet_v1 with ImageNet weights, using export_inference_graph in TF-slim and freeze_graph in TF 1.4.0. I ran label_image with this graph on the Grace Hopper image:

653:military uniform (653): 0.862189
458:bow tie, bow-tie, bowtie (458): 0.0605872
835:suit, suit of clothes (835): 0.0121595
723:ping-pong ball (723): 0.0107614
440:bearskin, busby, shako (440): 0.00682122

Well, no surprise.

I then used transform_graph to quantize the weights (--transforms='fold_batch_norms fold_old_batch_norms quantize_weights'):

653:military uniform (653): 0.763031
458:bow tie, bow-tie, bowtie (458): 0.0794292
835:suit, suit of clothes (835): 0.0416097
723:ping-pong ball (723): 0.0145141
753:racket, racquet (753): 0.01387

The score dropped a bit. But when I then used transform_graph to also quantize the nodes (--transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,-1,-1,3") fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes strip_unused_nodes sort_by_execution_order'), the inference result fell apart:

723:ping-pong ball (723): 0.391782
653:military uniform (653): 0.092184
835:suit, suit of clothes (835): 0.0860384
907:Windsor tie (907): 0.07682
918:comic book (918): 0.0353372

snownus commented 6 years ago

@h8907283 How did you patch the bug after the 1.4 release? Currently I still get the runtime error: inputs have different frames...

h8907283 commented 6 years ago

https://github.com/tensorflow/tensorflow/commit/17ce98437f34ab5439b3e46adb2eb5b692c48abd

I used the 1.4.0 release and applied the change in the above commit.

snownus commented 6 years ago

So do you mean you used the r1.4.0 branch, changed only the above commit, and everything works?

snownus commented 6 years ago

That means you built from source on release r1.4.0 with the change from the above commit, installed it, and it works?

snownus commented 6 years ago

I updated the TensorFlow source code on branch r1.4.0 with the above commit, and installed the pip package from the official site. When I run Faster R-CNN, it still raises the error: ValueError: graph_def is invalid at node u'Decode/get_center_coordinates_and_sizes/add_1_eightbit/Decode/get_center_coordinates_and_sizes/unstack__port__1/reshape_dims': Control input '^Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def.

snownus commented 6 years ago

@sungsulim Have you addressed the get_center_coordinates_and_sizes/unstack__port__1 issue?

h8907283 commented 6 years ago

I updated the TensorFlow source code on branch r1.4.0 with the above commit, and installed the pip package from the official site,

@snownus I built everything from source, the Python binding and the tools. I only tried SSD MobileNet; I didn't try Faster R-CNN. Maybe the fix didn't fix everything?

snownus commented 6 years ago

@h8907283 I see. Thanks very much for your help.

nmoezzi commented 6 years ago

I have a similar issue when quantizing using 'transform_graph'. I used the tool on AlexNet trained on ImageNet. When quantizing only the weights the accuracy is great; when quantizing both weights and nodes, I get a "graph_def is invalid" error:

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 316, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 406, in import_graph_def
    % (input_name,)))
ValueError: graph_def is invalid at node u'Conv2D_7_eightbit/split_4__port__1/reshape_dims': Control input '^split_4:1' not found in graph_def..

I'm using tensorflow-1.4.0-cp27-cp27mu. Has anyone solved/faced this issue?

Thanks

snownus commented 6 years ago

@nmoezzi Make sure the TensorFlow version is consistent on your machine.

nmoezzi commented 6 years ago

@snownus I used the bazel build (from source) to create the quantized graph and Python to run the TF inference example. I think the version is consistent. Should I run the Python example from the source code?

This time, instead of "transform_graph" (for weight and node quantization), I used bazel-bin/tensorflow/tools/quantization/quantize_graph and it worked; I see about a 2% accuracy loss for AlexNet, although some posts have mentioned that "quantize_graph" is obsolete.

sanbeng commented 6 years ago

@snownus @sungsulim I met the same issue. Did you address the error "Control input %r not found in graph_def"? I updated the TensorFlow source code on branch r1.4.0 with the above commit, and installed the pip package from the official site. When I run Faster R-CNN, it still raises the error: ValueError: graph_def is invalid at node u'Decode/get_center_coordinates_and_sizes/add_1_eightbit/Decode/get_center_coordinates_and_sizes/unstack__port__1/reshape_dims': Control input '^Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def.

nmoezzi commented 6 years ago

This time I tried quantizing the graph based on examples on https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/#eight-bit-calculations

First I quantized inception_v3_2016_08_28_frozen.pb using the following:

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen.pb \
--out_graph=tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen_quantized.pb \
--inputs="input" \
--outputs='InceptionV3/Predictions/Reshape_1' \
--transforms='
  add_default_attributes
  strip_unused_nodes(type=float, shape="1,299,299,3")
  remove_nodes(op=Identity, op=CheckNumerics)
  fold_constants(ignore_errors=true)
  fold_batch_norms
  fold_old_batch_norms
  quantize_weights
  quantize_nodes
  strip_unused_nodes
  sort_by_execution_order' \
--output_as_text=false

and then tried to use the quantized graph for image classification:

bazel-bin/tensorflow/examples/label_image/label_image \
--image=tensorflow/examples/label_image/data/grace_hopper.jpg \
--input_layer="input" \
--output_layer='InceptionV3/Predictions/Reshape_1' \
--graph=/tmp/logged_quantized_inception.pb \
--labels=tensorflow/examples/label_image/data/imagenet_slim_labels.txt

I get this error:

2017-11-29 16:41:21.942306: E tensorflow/examples/label_image/main.cc:327] Invalid argument: Node 'InceptionV3/InceptionV3/Conv2d_1a_3x3/BatchNorm/batchnorm/mul_eightbit/input__port__0/reduction_dims': Unknown input node '^input:0'

If I remove 'quantize_nodes' from --transforms options, the test will pass.

This seems to be a bug in TensorFlow's 'quantize_nodes'. TensorFlow version: 1.4.0-cp27-cp27mu

sanbeng commented 6 years ago

@snownus Thanks for your answer. I noticed the same problem: when I remove 'quantize_nodes' it works well on Linux (GPU) and the size of the .pb drops to about 1/4 of the original, but it does not speed up inference. And when I moved the quantized .pb to Windows (without a GPU), the .pb did not work and raised an error; someone said that Windows does not support quantization, so now I am at a loss.

sanbeng commented 6 years ago

@snownus The model I used was 'faster_rcnn_inception_resnet_v2_atrous_coco' from the Object Detection API, but I retrained it with my own data. Another problem is that I cannot add 'fold_constants'; it raises another error when I import the graph for inference, such as:

totalMemory: 10.91GiB freeMemory: 10.64GiB
2017-11-24 13:22:10.619632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
1 Fri Nov 24 13:22:12 2017
2017-11-24 13:22:19.280005: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2017-11-24 13:22:19.280057: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Traceback (most recent call last):
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./detectionImage.py", line 66, in <module>
    feed_dict={image_tensor: image_np_expanded})
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

Caused by op 'Preprocessor/map/TensorArray', defined at:
  File "./detectionImage.py", line 26, in <module>
    tf.import_graph_def(od_graph_def, name='')
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
    op_def=op_def)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.). [[Node: Preprocessor/map/TensorArray = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=, identical_element_shapes=false, tensor_array_name="", _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

sanbeng commented 6 years ago

@snownus The error when I added 'quantize_nodes' and then imported the quantized .pb was:

/home/emg/anaconda3/bin/python ./detectionImage.py
Traceback (most recent call last):
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 364, in import_graph_def
    source_op = name_to_op[input_name[1:]]
KeyError: 'Decode/get_center_coordinates_and_sizes/unstack:1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./detectionImage.py", line 26, in <module>
    tf.import_graph_def(od_graph_def, name='')
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 369, in import_graph_def
    'Control input %r not found in graph_def.' % (input_name,)))
ValueError: graph_def is invalid at node 'Decode/get_center_coordinates_and_sizes/add_1_eightbit/Decode/get_center_coordinates_and_sizes/unstack__port__1/reshape_dims': Control input '^Decode/get_center_coordinates_and_sizes/unstack:1' not found in graph_def..

This is different from yours; is it because of the different model?

snownus commented 6 years ago

@sanbeng Can you check your TensorFlow version? When I updated to TensorFlow 1.4, I didn't have that error.

snownus commented 6 years ago

@sanbeng, you can check the issue and solution here: https://github.com/tensorflow/tensorflow/pull/9792#issuecomment-344129365

sanbeng commented 6 years ago

@snownus I have updated to TensorFlow 1.4; the error is still there.

snownus commented 6 years ago

@sanbeng You need to update the code following https://github.com/wodesuck/tensorflow/commit/6c1ab6d34213057f5d70d194094ff48137815ae3

snownus commented 6 years ago

@sanbeng '^Decode' cannot be recognized. Please follow wodesuck/tensorflow@6c1ab6d.

snownus commented 6 years ago

@sanbeng But I still have other issues:

Invalid argument: input_max_range must be larger than input_min_range. [[Node: SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/quantize = QuantizeV2[T=DT_QUINT8, mode="MIN_FIRST", _device="/job:localhost/replica:0/task:0/device:CPU:0"](SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/_817, SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/min/_819, SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/mul_eightbit/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow_1/Area/sub_1/max/_821)]] 2017-12-06 11:06:30.110756: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: input_max_range must be larger than input_min_range.

sanbeng commented 6 years ago

@snownus OK, thanks very much, I will try it. But why can't the quantized model run on Windows?

sanbeng commented 6 years ago

@snownus I quantized the model on Ubuntu using a GPU; I want to run the quantized model on Windows with a CPU.

snownus commented 6 years ago

@sanbeng I am not sure whether the operating system affects the quantized model.

sanbeng commented 6 years ago

@snownus I did as you said, but I get a new error when I run the quantized model:

emg@emg-200:~/tf$ sudo /home/emg/anaconda3/bin/python ./detectionImage.py
Traceback (most recent call last):
  File "./detectionImage.py", line 24, in <module>
    tf.import_graph_def(od_graph_def, name='')
  File "/home/emg/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 338, in import_graph_def
    op_to_bind_to, node.name))
ValueError: Specified colocation to an op that does not exist during import: SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/while/strided_slice in SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/while/TensorArrayWrite_4/TensorArrayWriteV3/Enter

snownus commented 6 years ago

@sanbeng, I haven't come across such an error. Even now, I cannot run it because of bugs inside TensorFlow.

It works now. You can use the TensorFlow quantization lib: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python (TF version 1.8).

ganeshn85 commented 6 years ago

I still face the same quantization issue (with quantize_nodes as a transform) mentioned above, using ResNet with TF 1.8. Has this been resolved, or is it planned to be resolved?

input_max_range must be larger than input_min_range. [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/quantize = QuantizeV2[T=DT_QUINT8, mode="MIN_FIRST", round_mode="HALF_AWAY_FROM_ZERO", _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/min, Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/mul_eightbit/Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_12/Area/sub_1/max)]]
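
For what it's worth, that error fires when the min and max computed for the tensor feeding a QuantizeV2 node are equal, since QuantizeV2 requires input_min_range < input_max_range; a constant (for example all-zero) areas tensor is enough to trigger it. A tiny illustration of the failing condition, with made-up values:

import numpy as np

# Hypothetical stand-in for the Area/sub_1 tensor feeding the QuantizeV2 node.
areas = np.zeros((100,), dtype=np.float32)

input_min_range = float(areas.min())
input_max_range = float(areas.max())

# QuantizeV2 performs this check before quantizing and aborts the run if it fails.
if not input_max_range > input_min_range:
    raise ValueError('input_max_range must be larger than input_min_range.')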

achowdhery commented 6 years ago

Please try this updated set of instructions, which is specifically designed for MobileNet SSD and can be customized to your needs: https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193

achowdhery commented 6 years ago

Is this still an open issue?

snownus commented 6 years ago

@achowdhery I have addressed the issue. You can use the TensorFlow quantize lib: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python
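
For anyone landing here later, a minimal sketch of that route (assuming TF 1.8+; build_model is a placeholder for your own network rather than part of any API, and the shapes, quant_delay, and loss are arbitrary illustrations):

import tensorflow as tf

def build_model(images):
    """Placeholder for your own network definition; not a real API."""
    net = tf.layers.conv2d(images, 32, 3, padding='same', activation=tf.nn.relu)
    return tf.layers.conv2d(net, 4, 1, padding='same', name='predictions')

# Training graph: rewrite with fake-quantization ops so quantization ranges are learned.
train_graph = tf.Graph()
with train_graph.as_default():
    images = tf.placeholder(tf.float32, [1, 300, 300, 3], name='image_tensor')
    labels = tf.placeholder(tf.float32, [1, 300, 300, 4], name='labels')
    logits = build_model(images)
    loss = tf.losses.mean_squared_error(labels, logits)
    # Insert fake-quant ops before adding the optimizer; quant_delay here is arbitrary.
    tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=2000)
    train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
    # ... run training as usual ...

# Eval graph: rewrite for inference, then freeze it; the result is normally
# converted with the TensorFlow Lite converter rather than quantize_nodes.
eval_graph = tf.Graph()
with eval_graph.as_default():
    images = tf.placeholder(tf.float32, [1, 300, 300, 3], name='image_tensor')
    logits = build_model(images)
    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)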

snownus commented 6 years ago

@mcfair, you can refer to the link I sent: https://github.com/tensorflow/tensorflow/tree/r1.9/tensorflow/contrib/quantize/python

I use the interface of the latest TensorFlow quantization lib. It is okay now.

achowdhery commented 6 years ago

@snownus Thanks. I will close the issue. Please open a new bug if you have other questions.