Closed yinguobing closed 6 years ago
I encountered the same issue after the latest object detection update, also on the WIDER FACE dataset. At first I suspected it was caused by max_detections in the config, but nothing changed after I updated those values.
Update: here is part of a TFRecord I produced manually. This is one image with 10 bounding boxes and one class:
{
"features": {
"feature": {
"image/object/bbox/xmax": {
"floatList": {
"value": [
0.142578125,
0.48046875,
0.3330078125,
0.1513671875,
0.6669921875,
0.9248046875,
0.525390625,
0.40625,
0.8046875,
0.0791015625
]
}
},
"image/height": {
"int64List": {
"value": [
"683"
]
}
},
"image/format": {
"bytesList": {
"value": [
"anBlZw=="
]
}
},
"image/encoded": {
"bytesList": {
"value": [
"........(skip the image encode)"
]
}
},
"image/object/bbox/ymax": {
"floatList": {
"value": [
0.46412885189056396,
0.4011712968349457,
0.3572474420070648,
0.3382137715816498,
0.2986822724342346,
0.4026354253292084,
0.37774524092674255,
0.34992679953575134,
0.3953147828578949,
0.3103953003883362
]
}
},
"image/object/class/text": {
"bytesList": {
"value": [
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ==",
"RmFjZQ=="
]
}
},
"image/object/bbox/ymin": {
"floatList": {
"value": [
0.3250366151332855,
0.28257685899734497,
0.26207906007766724,
0.2430453896522522,
0.19326500594615936,
0.22254759073257446,
0.3001464009284973,
0.2796486020088196,
0.3060029149055481,
0.2708638310432434
]
}
},
"image/onject/class/label": {
"int64List": {
"value": [
"1",
"1",
"1",
"1",
"1",
"1",
"1",
"1",
"1",
"1"
]
}
},
"image/filename": {
"bytesList": {
"value": [
"MzYtLUZvb3RiYWxsLzM2X0Zvb3RiYWxsX2FtZXJpY2FuZm9vdGJhbGxfYmFsbF8zNl8yNTcuanBn"
]
}
},
"image/object/bbox/xmin": {
"floatList": {
"value": [
0.0693359375,
0.4189453125,
0.28515625,
0.09765625,
0.6142578125,
0.84375,
0.4912109375,
0.3779296875,
0.76171875,
0.060546875
]
}
},
"image/width": {
"int64List": {
"value": [
"1024"
]
}
}
}
}
}
I printed this JSON with:
import tensorflow as tf
from google.protobuf.json_format import MessageToJson
for example in tf.python_io.tf_record_iterator("tfrecord"):
    print(MessageToJson(tf.train.Example.FromString(example)))
And here is wider_label_map.pbtxt:
item {
id: 1
name: "Face"
}
Hope this can help
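For anyone reading the dump above: MessageToJson writes bytes fields as base64 strings and int64 values as quoted strings, so the class text and image format are not readable as printed. A minimal sketch for decoding them, using only the values already shown in the dump (decode_bytes_list is a hypothetical helper name, not part of any library):

```python
import base64

def decode_bytes_list(values):
    """Decode the base64 strings of a JSON-dumped bytesList into text."""
    return [base64.b64decode(v).decode("utf-8") for v in values]

# Values copied from the JSON dump in this comment.
image_format = decode_bytes_list(["anBlZw=="])
class_text = decode_bytes_list(["RmFjZQ=="] * 10)
filename = decode_bytes_list([
    "MzYtLUZvb3RiYWxsLzM2X0Zvb3RiYWxsX2FtZXJpY2FuZm9vdGJhbGxfYmFsbF8zNl8yNTcuanBn"
])

print(image_format[0])  # image format string
print(class_text[0])    # class name, should match the label map entry
print(filename[0])      # relative path of the source image
```

The decoded class text is "Face", which matches the name in wider_label_map.pbtxt.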
I encounter the same error when training openimages on ssd mobilenetv2. How to solve this problem?
I encounter the same error when training my own data on ssd mobilenetv2. The log is the same as yours! Did you solve this problem?
I already sent out PR to fix this. Please try syncing to HEAD after the PR is merged.
Great to hear that! I'm pulling from the repo now, coming back later.
It's working. This issue is safe to close.
How can I get this latest change? Can I just re-download the tensorflow repo and use the updated object detection API without changing tensorflow installation?
@junostar You don't have to reinstall TensorFlow. This is a new version of models rather than TensorFlow. So just download and replace the models if you need.
I'm still getting the same error even after using the latest tensorflow pull (see the trace below). Any ideas?
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-08-08 22:22:41.064794: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-08-08 22:22:41.567691: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-08 22:22:41.570874: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-08-08 22:22:41.572346: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-08-08 22:22:41.573874: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
return fn(*args)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 100], but got 109
[[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "object_detection\model_main.py", line 101, in <module>
tf.app.run()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "object_detection\model_main.py", line 97, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 439, in train_and_evaluate
executor.run()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 518, in run
self.run_local()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 650, in run_local
hooks=train_hooks)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 363, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 843, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 859, in _train_model_default
saving_listeners)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1059, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 567, in run
run_metadata=run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1043, in run
run_metadata=run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1134, in run
raise six.reraise(*original_exc_info)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\six.py", line 693, in reraise
raise value
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1119, in run
return self._sess.run(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1191, in run
run_metadata=run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\training\monitored_session.py", line 971, in run
return self._sess.run(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
run_metadata_ptr)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
run_metadata)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 100], but got 109
[[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]
Caused by op 'Slice_31', defined at:
File "object_detection\model_main.py", line 101, in <module>
tf.app.run()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "object_detection\model_main.py", line 97, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 439, in train_and_evaluate
executor.run()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 518, in run
self.run_local()
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\training.py", line 650, in run_local
hooks=train_hooks)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 363, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 843, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 856, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\estimator\estimator.py", line 831, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "C:\Users\JonPa\PycharmProjects\Tensorflow-GPU\models\research\object_detection\model_lib.py", line 216, in model_fn
unpad_groundtruth_tensors=train_config.unpad_groundtruth_tensors)
File "C:\Users\JonPa\PycharmProjects\Tensorflow-GPU\models\research\object_detection\model_lib.py", line 163, in unstack_batch
unpadded_tensor = tf.slice(padded_tensor, slice_begin, slice_size)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\array_ops.py", line 650, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 8584, in _slice
"Slice", input=input, begin=begin, size=size, name=name)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda2\envs\tf-gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Expected size[0] in [0, 100], but got 109
[[Node: Slice_31 = Slice[Index=DT_INT32, T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](unstack_1:31, zeros_128, stack_31)]]
Not sure if you successfully synced to HEAD. I suggest you do this and double-check the commit you synced to.
@pkulzc It should be the latest commit. Here is the output from git:
>git checkout HEAD
M research/object_detection/model_lib.py
Your branch is up to date with 'origin/master'.
>git merge upstream/master
Already up to date.
@junostar hmm, what's the output of 'git log'?
@pkulzc Here is the output:
tensorflow\models>git log
commit ad3526a98e7d5e9e57c029b8857ef7b15c903ca2 (HEAD -> master, upstream/master, origin/master, origin/HEAD)
Merge: 59f7e80a b6bc1a1d
Author: Mark Daoust <markdaoust@google.com>
Date: Wed Aug 8 09:31:12 2018 -0700
Merge pull request #5032 from HughKu/fix/#21199
Fix a doc typo fashion-mnist example in #21199
commit b6bc1a1dd2ce93fcf017fc819135a2642ac6c3df
Author: Wei-Lin Ku <hughku@gmail.com>
Date: Thu Aug 9 00:01:07 2018 +0800
Fix Docs Typo Tensorflow#21199
commit 59f7e80ac8ad54913663a4b63ddf5a3db3689648
Author: pkulzc <lzc@google.com>
Date: Tue Aug 7 19:00:27 2018 -0700
Update object detection post processing and fixes boxes padding/clipping issue. (#5026)
* Merged commit includes the following changes:
207771702 by Zhichao Lu:
Refactoring evaluation utilities so that it is easier to introduce new DetectionEvaluators with eval_metric_ops.
--
207758641 by Zhichao Lu:
Require tensorflow version 1.9+ for running object detection API.
--
207641470 by Zhichao Lu:
Clip `num_groundtruth_boxes` in pad_input_data_to_static_shapes() to `max_num_boxes`. This prevents a scenario where tensors are sliced to an invalid range in model_lib.unstack_batch().
--
207621728 by Zhichao Lu:
This CL adds a FreezableBatchNorm that inherits from the Keras BatchNormalization layer, but supports freezing the `training` parameter at construction time instead of having to do it in the `call` method.
It also adds a method to the `KerasLayerHyperparams` class that will build an appropriate FreezableBatchNorm layer according to the hyperparameter configuration. If batch_norm is disabled, this method returns and Identity layer.
These will be used to simplify the conversion to Keras APIs.
--
207610524 by Zhichao Lu:
Update anchor generators and box predictors for python3 compatibility.
--
207585122 by Zhichao Lu:
Refactoring convolutional box predictor into separate prediction heads.
Yeah looks like you are using HEAD. We may need some more time to investigate.
Edit: please enlarge max_number_of_boxes to see how it works.
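To make the failure mode concrete, here is a rough sketch in plain Python of what the commit note and the error message seem to describe: the box tensor is padded or truncated to a fixed size, but the box count used for the later slice is not clipped. This is only an illustration, not the actual code in model_lib.py or the input pipeline. MAX_NUM_BOXES, pad_or_clip and strict_slice are hypothetical names, and strict_slice only mimics the range check reported in the error.

```python
MAX_NUM_BOXES = 100  # assumption: padded size, taken from the "[0, 100]" in the error

def pad_or_clip(boxes, max_num_boxes):
    """Return exactly max_num_boxes rows: truncate extras, zero-pad the rest."""
    kept = [list(b) for b in boxes[:max_num_boxes]]
    padding = [[0.0, 0.0, 0.0, 0.0] for _ in range(max_num_boxes - len(kept))]
    return kept + padding

def strict_slice(rows, begin, size):
    """Slice rows, rejecting sizes that exceed the data.

    Plain Python slicing would silently return fewer rows, so the range
    check is spelled out here to mirror the reported error.
    """
    limit = len(rows) - begin
    if not 0 <= size <= limit:
        raise ValueError("Expected size[0] in [0, %d], but got %d" % (limit, size))
    return rows[begin:begin + size]

boxes = [[0.1, 0.1, 0.2, 0.2] for _ in range(109)]  # hypothetical image with 109 boxes
padded = pad_or_clip(boxes, MAX_NUM_BOXES)

# Unclipped count: the slice asks for 109 rows from a 100-row tensor.
try:
    strict_slice(padded, 0, len(boxes))
    error = None
except ValueError as exc:
    error = str(exc)

# Clipped count: the slice stays in range.
num_boxes = min(len(boxes), MAX_NUM_BOXES)
unpadded = strict_slice(padded, 0, num_boxes)

print(error)
print(len(unpadded))
```

Under this reading, raising the padded size above the largest per-image box count avoids the mismatch as well, at the cost of larger padded tensors.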
I am using the WIDER FACE dataset and experiencing the same issue after I pulled the latest master branch. I also tried to modify max_detections_per_class and max_total_detections in the pipeline config file, but it does not seem to work. Is the pipeline file the one I should modify?
@Walkerlikesfish Could you please try using a larger max_number_of_boxes (in train config)?
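One way to choose a value for max_number_of_boxes is to find the largest number of boxes in any single record. A minimal sketch, assuming each record has been dumped to JSON in the same shape as the MessageToJson output earlier in this thread; count_boxes and largest_box_count are hypothetical helpers and the two sample records are made up for illustration:

```python
import json

def count_boxes(example_json):
    """Return the number of boxes in one MessageToJson-style Example dump."""
    feature = json.loads(example_json)["features"]["feature"]
    xmin = feature.get("image/object/bbox/xmin", {})
    return len(xmin.get("floatList", {}).get("value", []))

def largest_box_count(example_jsons):
    """Return the maximum box count over all dumps (0 if there are none)."""
    return max((count_boxes(s) for s in example_jsons), default=0)

def make_sample(num_boxes):
    """Build a made-up dump holding only the xmin list (hypothetical data)."""
    return json.dumps({
        "features": {
            "feature": {
                "image/object/bbox/xmin": {"floatList": {"value": [0.1] * num_boxes}}
            }
        }
    })

samples = [make_sample(10), make_sample(109)]
needed = largest_box_count(samples)
print(needed)  # a lower bound for max_number_of_boxes on this sample data
```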
I am having a similar issue with the Python 3 version of the TensorFlow Serving code: 'error': 'Expected size[1] in [0, 21], but got 25\n\t [[{{node FPN_slice_lvl4/narrow_to/Slice}}]]'
It seems that larger images trigger this error, while relatively small images pass prediction with the right outcomes.
System information
You can collect some of this information using our environment capture script:
https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh
tf_env.txt
Describe the problem
I guess this is a bug. Using legacy/train.py the training process goes well, but it fails with model_main.py. The error message is like:
InvalidArgumentError (see above for traceback): Expected size[0] in [0, 100], but got 109
It's not always 109; the number varies between runs.
Source code / logs
The commit ID is: 02a9969e94feb51966f9bacddc1836d811f8ce69
Logs
Config
I also uploaded a TFRecord file here: https://drive.google.com/open?id=1NtNA1LefRYGSbRwTiahO_PrHfSjrR4mU