tensorflow / models

Models and examples built with TensorFlow

[Object Detection] InvalidArgumentError: Expected size[0] in [0, 100], but got 109 #4981

Closed yinguobing closed 6 years ago

yinguobing commented 6 years ago

System information

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

tf_env.txt

Describe the problem

I suspect this is a bug. Training with legacy/train.py runs fine, but it fails with model_main.py.

The error message is: InvalidArgumentError (see above for traceback): Expected size[0] in [0, 100], but got 109

It's not always 109; the number varies between runs.
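
For illustration only (not the actual library code): I assume model_lib.unstack_batch slices the padded groundtruth tensors by num_groundtruth_boxes, which for some of my images exceeds the padded size of 100. A minimal sketch that reproduces the same error message:

import tensorflow as tf

# Groundtruth tensor padded to the default max_number_of_boxes of 100 rows,
# while the reported number of boxes for this image is 109.
padded_boxes = tf.zeros([100, 4])
num_groundtruth_boxes = tf.constant(109)

slice_size = tf.stack([num_groundtruth_boxes, -1])
unpadded_boxes = tf.slice(padded_boxes, [0, 0], slice_size)

with tf.Session() as sess:
    # Raises InvalidArgumentError: Expected size[0] in [0, 100], but got 109
    sess.run(unpadded_boxes)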

Source code / logs

The commit ID is: 02a9969e94feb51966f9bacddc1836d811f8ce69

Logs

/opt/models/research/object_detection/utils/visualization_utils.py:25: UserWarning: 
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.

The backend was *originally* set to 'TkAgg' by the following code:
  File "object_detection/model_main.py", line 26, in <module>
    from object_detection import model_lib
  File "/opt/models/research/object_detection/model_lib.py", line 26, in <module>
    from object_detection import eval_util
  File "/opt/models/research/object_detection/eval_util.py", line 28, in <module>
    from object_detection.metrics import coco_evaluation
  File "/opt/models/research/object_detection/metrics/coco_evaluation.py", line 20, in <module>
    from object_detection.metrics import coco_tools
  File "/opt/models/research/object_detection/metrics/coco_tools.py", line 47, in <module>
    from pycocotools import coco
  File "/opt/models/research/pycocotools/coco.py", line 49, in <module>
    import matplotlib.pyplot as plt
  File "/home/robin/.local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
    from matplotlib.backends import pylab_setup
  File "/home/robin/.local/lib/python2.7/site-packages/matplotlib/backends/__init__.py", line 16, in <module>
    line for line in traceback.format_stack()

  import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements
WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f646c4b7e60>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:num_readers has been reduced to 10 to match input file shards.
WARNING:tensorflow:From /opt/models/research/object_detection/core/preprocessor.py:1205: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the `axis` argument instead
2018-08-02 15:29:10.072903: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "object_detection/model_main.py", line 101, in <module>
    tf.app.run()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "object_detection/model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 447, in train_and_evaluate
    return executor.run()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 531, in run
    return self.run_local()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 669, in run_local
    hooks=train_hooks)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 366, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1119, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1135, in _train_model_default
    saving_listeners)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1336, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 577, in run
    run_metadata=run_metadata)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1053, in run
    run_metadata=run_metadata)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1144, in run
    raise six.reraise(*original_exc_info)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1129, in run
    return self._sess.run(*args, **kwargs)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 1201, in run
    run_metadata=run_metadata)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 981, in run
    return self._sess.run(*args, **kwargs)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 100], but got 109
     [[Node: Slice_83 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](unstack_4:11, zeros_48, stack_83)]]

Caused by op u'Slice_83', defined at:
  File "object_detection/model_main.py", line 101, in <module>
    tf.app.run()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "object_detection/model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 447, in train_and_evaluate
    return executor.run()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 531, in run
    return self.run_local()
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 669, in run_local
    hooks=train_hooks)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 366, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1119, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1132, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1107, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/opt/models/research/object_detection/model_lib.py", line 216, in model_fn
    unpad_groundtruth_tensors=train_config.unpad_groundtruth_tensors)
  File "/opt/models/research/object_detection/model_lib.py", line 163, in unstack_batch
    unpadded_tensor = tf.slice(padded_tensor, slice_begin, slice_size)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 576, in slice
    return gen_array_ops._slice(input_, begin, size, name=name)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 7177, in _slice
    "Slice", input=input, begin=begin, size=size, name=name)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/home/robin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Expected size[0] in [0, 100], but got 109
     [[Node: Slice_83 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](unstack_4:11, zeros_48, stack_83)]]

Config

# SSD with Mobilenet v2 configuration for WIDERFACE Dataset.
model {
  ssd {
    num_classes: 1

    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }

    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }

    similarity_calculator {
      iou_similarity {
      }
    }

    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }

    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }

    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }

    feature_extractor {
      type: 'ssd_mobilenet_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
      use_depthwise: true
    }

    loss {
      classification_loss {
        weighted_sigmoid_focal {
          gamma: 2.0
          alpha: 0.75
        }
      }
      localization_loss {
        weighted_smooth_l1 {

        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

# Configuration for training.
train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # NOT USING TRANSFER LEARNING
  # fine_tune_checkpoint: "/home/ubuntu/face_ssd_mobilenet_v2/model/restore/model.ckpt"
  # fine_tune_checkpoint_type:  "detection"
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

# Configuration for training input.
train_input_reader: {
  tf_record_input_reader {
    input_path: "/home/ubuntu/face_ssd_mobilenet_v2/data/wider_train.record-?????-of-00010"
  }
  label_map_path: "/home/ubuntu/face_ssd_mobilenet_v2/model/configs/label_map.pbtxt"
}

# Configuration for evaluation.
eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

# Configuration for evaluation input.
eval_input_reader: {
  tf_record_input_reader {
    input_path: "/home/ubuntu/face_ssd_mobilenet_v2/data/wider_val.record-?????-of-00010"
  }
  label_map_path: "/home/ubuntu/face_ssd_mobilenet_v2/model/configs/label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

I also uploaded a TFRecord file here: https://drive.google.com/open?id=1NtNA1LefRYGSbRwTiahO_PrHfSjrR4mU
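
A quick way to check how many boxes each example in such a record carries (a sketch, assuming the standard image/object/bbox/xmin feature key); any example with more than 100 boxes exceeds the default padding of 100:

import tensorflow as tf

max_boxes = 0
# Hypothetical shard name; adjust to an actual record file.
for serialized in tf.python_io.tf_record_iterator("wider_train.record-00000-of-00010"):
    example = tf.train.Example.FromString(serialized)
    num_boxes = len(example.features.feature["image/object/bbox/xmin"].float_list.value)
    max_boxes = max(max_boxes, num_boxes)
print("Largest number of boxes in a single example:", max_boxes)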

Kuanch commented 6 years ago

I encountered the same issue after the latest Object Detection update, also on the WIDER FACE dataset. At first I suspected max_detections in the config was the cause, but nothing changed after updating those values.

Update: here is part of a TFRecord I produced manually. This is one image with 10 bounding boxes and one class:

{
  "features": {
    "feature": {
      "image/object/bbox/xmax": {
        "floatList": {
          "value": [
            0.142578125, 
            0.48046875, 
            0.3330078125, 
            0.1513671875, 
            0.6669921875, 
            0.9248046875, 
            0.525390625, 
            0.40625, 
            0.8046875, 
            0.0791015625
          ]
        }
      }, 
      "image/height": {
        "int64List": {
          "value": [
            "683"
          ]
        }
      }, 
      "image/format": {
        "bytesList": {
          "value": [
            "anBlZw=="
          ]
        }
      }, 
      "image/encoded": {
        "bytesList": {
          "value": [
            "........(skip the image encode)"
          ]
        }
      }, 
      "image/object/bbox/ymax": {
        "floatList": {
          "value": [
            0.46412885189056396, 
            0.4011712968349457, 
            0.3572474420070648, 
            0.3382137715816498, 
            0.2986822724342346, 
            0.4026354253292084, 
            0.37774524092674255, 
            0.34992679953575134, 
            0.3953147828578949, 
            0.3103953003883362
          ]
        }
      }, 
      "image/object/class/text": {
        "bytesList": {
          "value": [
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ==", 
            "RmFjZQ=="
          ]
        }
      }, 
      "image/object/bbox/ymin": {
        "floatList": {
          "value": [
            0.3250366151332855, 
            0.28257685899734497, 
            0.26207906007766724, 
            0.2430453896522522, 
            0.19326500594615936, 
            0.22254759073257446, 
            0.3001464009284973, 
            0.2796486020088196, 
            0.3060029149055481, 
            0.2708638310432434
          ]
        }
      }, 
      "image/onject/class/label": {
        "int64List": {
          "value": [
            "1", 
            "1", 
            "1", 
            "1", 
            "1", 
            "1", 
            "1", 
            "1", 
            "1", 
            "1"
          ]
        }
      }, 
      "image/filename": {
        "bytesList": {
          "value": [
            "MzYtLUZvb3RiYWxsLzM2X0Zvb3RiYWxsX2FtZXJpY2FuZm9vdGJhbGxfYmFsbF8zNl8yNTcuanBn"
          ]
        }
      }, 
      "image/object/bbox/xmin": {
        "floatList": {
          "value": [
            0.0693359375, 
            0.4189453125, 
            0.28515625, 
            0.09765625, 
            0.6142578125, 
            0.84375, 
            0.4912109375, 
            0.3779296875, 
            0.76171875, 
            0.060546875
          ]
        }
      }, 
      "image/width": {
        "int64List": {
          "value": [
            "1024"
          ]
        }
      }
    }
  }
}

Print this JSON with:

from google.protobuf.json_format import MessageToJson
import tensorflow as tf

for example in tf.python_io.tf_record_iterator("tfrecord"):
    print(MessageToJson(tf.train.Example.FromString(example)))
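
Note that MessageToJson base64-encodes bytes features, so the strings above decode back to the original values; for example:

import base64

print(base64.b64decode("RmFjZQ=="))  # b'Face' -- the class text
print(base64.b64decode("anBlZw=="))  # b'jpeg' -- the image format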

and here is wider_label_map.pbtxt:

item {
  id: 1
  name: "Face"
}

Hope this can help

ztwe commented 6 years ago

I encounter the same error when training Open Images on SSD MobileNet v2. How can this problem be solved?

GeneralLi95 commented 6 years ago

I encounter the same error when training on my own data with SSD MobileNet v2. The log is the same as yours! Did you solve this problem?

pkulzc commented 6 years ago

I already sent out a PR to fix this. Please try syncing to HEAD after the PR is merged.

yinguobing commented 6 years ago

Great to hear that! I'm pulling from the repo now; coming back later.

yinguobing commented 6 years ago

It's working. This issue is safe to close.

jpxrc commented 6 years ago

How can I get this latest change? Can I just re-download the tensorflow repo and use the updated Object Detection API without changing my TensorFlow installation?

GeneralLi95 commented 6 years ago

@junostar You don't have to reinstall TensorFlow. This is a new version of the models repository rather than of TensorFlow itself, so just download and replace the models repository if you need to.

jpxrc commented 6 years ago

I'm still getting the same error even after using the latest tensorflow pull (see the trace below). Any ideas?

pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-08-08 22:22:41.064794: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-08-08 22:22:41.567691: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-08 22:22:41.570874: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]      0
2018-08-08 22:22:41.572346: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0:   N
2018-08-08 22:22:41.573874: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
    return fn(*args)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 100], but got 109
         [[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "object_detection\model_main.py", line 101, in <module>
    tf.app.run()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "object_detection\model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 439, in train_and_evaluate
    executor.run()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 518, in run
    self.run_local()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 650, in run_local
    hooks=train_hooks)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 363, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 843, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 859, in _train_model_default
    saving_listeners)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1059, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 567, in run
    run_metadata=run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1043, in run
    run_metadata=run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1134, in run
    raise six.reraise(*original_exc_info)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1119, in run
    return self._sess.run(*args, **kwargs)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1191, in run
    run_metadata=run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 971, in run
    return self._sess.run(*args, **kwargs)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
    run_metadata_ptr)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
    run_metadata)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 100], but got 109
         [[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]

Caused by op 'Slice_31', defined at:
  File "object_detection\model_main.py", line 101, in <module>
    tf.app.run()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "object_detection\model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 439, in train_and_evaluate
    executor.run()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 518, in run
    self.run_local()
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 650, in run_local
    hooks=train_hooks)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 363, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 843, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 856, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 831, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\JonPa\PycharmProjects\Tensorflow-GPU\models\research\object_detection\model_lib.py", line 216, in model_fn
    unpad_groundtruth_tensors=train_config.unpad_groundtruth_tensors)
  File "C:\Users\JonPa\PycharmProjects\Tensorflow-GPU\models\research\object_detection\model_lib.py", line 163, in unstack_batch
    unpadded_tensor = tf.slice(padded_tensor, slice_begin, slice_size)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\array_ops.py", line 650, in slice
    return gen_array_ops._slice(input_, begin, size, name=name)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 8584, in _slice
    "Slice", input=input, begin=begin, size=size, name=name)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
    op_def=op_def)
  File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Expected size[0] in [0, 100], but got 109
         [[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]

pkulzc commented 6 years ago

Not sure if you successfully synced to HEAD. I suggest you do this and double-check the commit you synced to.
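
For example, something along these lines (assuming an `upstream` remote pointing at tensorflow/models):

git fetch upstream
git checkout master
git merge upstream/master
git log -1   # verify the commit you are now on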

jpxrc commented 6 years ago

@pkulzc It should be the latest commit. Here is the output from git:

>git checkout HEAD
M       research/object_detection/model_lib.py
Your branch is up to date with 'origin/master'.

>git merge upstream/master
Already up to date.
pkulzc commented 6 years ago

@junostar hmm, what's the output of 'git log'?

jpxrc commented 6 years ago

@pkulzc Here is the output:

tensorflow\models>git log
commit ad3526a98e7d5e9e57c029b8857ef7b15c903ca2 (HEAD -> master, upstream/master, origin/master, origin/HEAD)
Merge: 59f7e80a b6bc1a1d
Author: Mark Daoust <markdaoust@google.com>
Date:   Wed Aug 8 09:31:12 2018 -0700

    Merge pull request #5032 from HughKu/fix/#21199

    Fix a doc typo fashion-mnist example in #21199

commit b6bc1a1dd2ce93fcf017fc819135a2642ac6c3df
Author: Wei-Lin Ku <hughku@gmail.com>
Date:   Thu Aug 9 00:01:07 2018 +0800

    Fix Docs Typo Tensorflow#21199

commit 59f7e80ac8ad54913663a4b63ddf5a3db3689648
Author: pkulzc <lzc@google.com>
Date:   Tue Aug 7 19:00:27 2018 -0700

    Update object detection post processing and fixes boxes padding/clipping issue. (#5026)

    * Merged commit includes the following changes:
    207771702  by Zhichao Lu:

        Refactoring evaluation utilities so that it is easier to introduce new DetectionEvaluators with eval_metric_ops.

    --
    207758641  by Zhichao Lu:

        Require tensorflow version 1.9+ for running object detection API.

    --
    207641470  by Zhichao Lu:

        Clip `num_groundtruth_boxes` in pad_input_data_to_static_shapes() to `max_num_boxes`. This prevents a scenario where tensors are sliced to an invalid range in model_lib.unstack_batch().

    --
    207621728  by Zhichao Lu:

        This CL adds a FreezableBatchNorm that inherits from the Keras BatchNormalization layer, but supports freezing the `training` parameter at construction time instead of having to do it in the `call` method.

        It also adds a method to the `KerasLayerHyperparams` class that will build an appropriate FreezableBatchNorm layer according to the hyperparameter configuration. If batch_norm is disabled, this method returns and Identity layer.

        These will be used to simplify the conversion to Keras APIs.

    --
    207610524  by Zhichao Lu:

        Update anchor generators and box predictors for python3 compatibility.

    --
    207585122  by Zhichao Lu:

        Refactoring convolutional box predictor into separate prediction heads.
pkulzc commented 6 years ago

Yeah, it looks like you are using HEAD. We may need some more time to investigate.

Edit: please enlarge max_number_of_boxes to see if that resolves it.
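
For reference, a sketch of where that field goes in the pipeline config (200 is only an example value; set it to at least the largest number of boxes in any of your images):

train_config: {
  batch_size: 24
  # Groundtruth is padded/clipped to this many boxes per image; the default is 100.
  max_number_of_boxes: 200
  ...
}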

Walkerlikesfish commented 6 years ago

I am using the WIDER FACE dataset and experiencing the same issue after pulling the latest master branch. I also tried modifying max_detections_per_class and max_total_detections in the pipeline config file, but that does not seem to help. Is the pipeline file what I should be modifying?

pkulzc commented 6 years ago

@Walkerlikesfish Could you please try using a larger max_number_of_boxes (in the train config)?

MahdiEsf commented 4 years ago

I'm having a similar issue with the Python 3 version of TensorFlow Serving code: 'error': 'Expected size[1] in [0, 21], but got 25\n\t [[{{node FPN_slice_lvl4/narrow_to/Slice}}]]'

It seems that larger images trigger this error, while relatively smaller images pass prediction with the correct outcomes.