matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Error while training from COCO #1407

Open AndreaPi opened 5 years ago

AndreaPi commented 5 years ago

I'm running Mask R-CNN in a Docker container with 6 GB of RAM (I can increase it if needed). I'm training on COCO with:

python samples/coco/coco.py train --dataset=/COCO/ --model=mask_rcnn_coco.h5

and I get the following error in epoch 1:

2019-04-03 17:25:02.914326: W ./tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:241] Failed to run optimizer ArithmeticOptimizer, stage RemoveStackStridedSliceSameAxis node proposal_targets/strided_slice. Error: ValidateStridedSliceOp returned partial shapes [1,?,?] and [?,?]
2019-04-03 17:25:03.174141: W ./tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:241] Failed to run optimizer ArithmeticOptimizer, stage RemoveStackStridedSliceSameAxis node proposal_targets/strided_slice_37. Error: ValidateStridedSliceOp returned partial shapes [1,?,?] and [?,?]
2019-04-03 17:27:53.164167: W ./tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:241] Failed to run optimizer ArithmeticOptimizer, stage RemoveStackStridedSliceSameAxis node proposal_targets/strided_slice. Error: ValidateStridedSliceOp returned partial shapes [1,?,?] and [?,?]
2019-04-03 17:27:53.198986: W ./tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:241] Failed to run optimizer ArithmeticOptimizer, stage RemoveStackStridedSliceSameAxis node proposal_targets/strided_slice_37. Error: ValidateStridedSliceOp returned partial shapes [1,?,?] and [?,?]
Killed

There seems to be an issue with a TensorFlow optimizer. My configuration:

Linux version 4.9.125-linuxkit (root@659b6d51c354) (gcc version 6.4.0 (Alpine 6.4.0) ) #1 SMP Fri Sep 7 08:20:28 UTC 2018
Python 3.7.2
tensorflow.__version__: '1.13.1'
keras.__version__ : '2.2.4'
numpy.__version__: '1.16.2'
scipy.__version__: '1.2.1'
PIL.__version__: '6.0.0'
cython.__version__: '0.29.6'
matplotlib.__version__: '3.0.3'
skimage.__version__: '0.15.0'
cv2.__version__: '4.0.0'
h5py.__version__: '2.9.0'
imgaug.__version__: '0.2.8'
IPython.__version__: '7.4.0'

Can you help? Thanks

PS: before the fatal error, I also get some warnings:

WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gradients_impl.py:110: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.7/site-packages/keras/engine/training_generator.py:47: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class.
  UserWarning('Using a generator with `use_multiprocessing=True`'

jiunbae commented 5 years ago

Maybe a duplicate. Same issue.

MikhailSam's answer works for me!

AndreaPi commented 5 years ago

I'll go have a look, and if it's a duplicate I'll close this issue. Thanks @MaybeS!
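
A note on the log above: a bare `Killed` line with no Python traceback is often what is left behind when the kernel's out-of-memory killer stops a process, and the `IndexedSlices` warning also mentions high memory use. Nothing in the thread confirms that this is the cause here. One way to check is to log the peak memory of the training process and compare it with the container limit. The sketch below uses only the standard-library `resource` module. The helper names `peak_rss_mb` and `near_limit` are hypothetical, and the 6 GB figure is taken from the report above.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def near_limit(limit_mb, threshold=0.9):
    """Return True if peak memory reached `threshold` of `limit_mb`.

    `limit_mb` is an assumption supplied by the caller (for example the
    Docker memory limit); it is not read from the container.
    """
    return peak_rss_mb() >= threshold * limit_mb


if __name__ == "__main__":
    limit_mb = 6 * 1024  # assumed container limit, from the report above
    print(f"peak RSS: {peak_rss_mb():.0f} MiB, near limit: {near_limit(limit_mb)}")
```

This only measures the calling process. Memory held by data-loading worker processes is not included, so the real total may be higher than the number printed.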