matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

InvalidArgumentError: slice index 33 of dimension 0 out of bounds at the time of prediction #817

Open janismdhanbad opened 6 years ago

janismdhanbad commented 6 years ago

I trained my model successfully and am now trying to run prediction on images. I set IMAGES_PER_GPU in my InferenceConfig class. Up to IMAGES_PER_GPU = 32 there is no error, but above that I get the error below. I looked at some issues in the TensorFlow repository and I think it might be related to the input that is being passed. The traceback points to line 828 of utils.py:

inputs_slice = [x[i] for x in inputs]

    InvalidArgumentError                      Traceback (most recent call last)
    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
       1321     try:
    -> 1322       return fn(*args)
       1323     except errors.OpError as e:

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
       1306       return self._call_tf_sessionrun(
    -> 1307           options, feed_dict, fetch_list, target_list, run_metadata)
       1308

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
       1408           self._session, options, feed_dict, fetch_list, target_list,
    -> 1409           run_metadata)
       1410     else:

    InvalidArgumentError: slice index 33 of dimension 0 out of bounds.
      [[Node: ROI_1/strided_slice_228 = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_input_anchors_1_0_0/_8681, mrcnn_detection_1/strided_slice_744/stack_1, mrcnn_detection_1/strided_slice_767/stack_1, mrcnn_mask_deconv_1/strided_slice/stack_1)]]
      [[Node: mrcnn_detection_1/map_24/while/Identity/_10586 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_19325_mrcnn_detection_1/map_24/while/Identity", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopmrcnn_detection_1/map_24/while/TensorArrayReadV3/_8401)]]

During handling of the above exception, another exception occurred:

    InvalidArgumentError                      Traceback (most recent call last)
    in <module>()
          4 for img_id in batch_id :
          5     img_batch.append(skimage.io.imread(os.path.join(resizedImageDir, filenames[img_id])))
    ----> 6 result = model.detect(img_batch)

    ~/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/model.py in detect(self, images, verbose)
       2500         # Run object detection
       2501         detections, _, _, mrcnn_mask, _, _, _ =\
    -> 2502             self.keras_model.predict([molded_images, image_metas, anchors], verbose=0)
       2503         # Process detections
       2504         results = []

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose, steps)
       1833         f = self.predict_function
       1834         return self._predict_loop(f, ins, batch_size=batch_size,
    -> 1835                                   verbose=verbose, steps=steps)
       1836
       1837     def train_on_batch(self, x, y,

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in _predict_loop(self, f, ins, batch_size, verbose, steps)
       1329                     ins_batch[i] = ins_batch[i].toarray()
       1330
    -> 1331             batch_outs = f(ins_batch)
       1332             if not isinstance(batch_outs, list):
       1333                 batch_outs = [batch_outs]

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
       2480         session = get_session()
       2481         updated = session.run(fetches=fetches, feed_dict=feed_dict,
    -> 2482                               **self.session_kwargs)
       2483         return updated[:len(self.outputs)]
       2484

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
        898     try:
        899       result = self._run(None, fetches, feed_dict, options_ptr,
    --> 900                          run_metadata_ptr)
        901       if run_metadata:
        902         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
       1133     if final_fetches or final_targets or (handle and feed_dict_tensor):
       1134       results = self._do_run(handle, final_targets, final_fetches,
    -> 1135                              feed_dict_tensor, options, run_metadata)
       1136     else:
       1137       results = []

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
       1314     if handle is None:
       1315       return self._do_call(_run_fn, feeds, fetches, targets, options,
    -> 1316                            run_metadata)
       1317     else:
       1318       return self._do_call(_prun_fn, handle, feeds, fetches)

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
       1333         except KeyError:
       1334           pass
    -> 1335       raise type(e)(node_def, op, message)
       1336
       1337   def _extend_graph(self):

    InvalidArgumentError: slice index 33 of dimension 0 out of bounds.
      [[Node: ROI_1/strided_slice_228 = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_input_anchors_1_0_0/_8681, mrcnn_detection_1/strided_slice_744/stack_1, mrcnn_detection_1/strided_slice_767/stack_1, mrcnn_mask_deconv_1/strided_slice/stack_1)]]
      [[Node: mrcnn_detection_1/map_24/while/Identity/_10586 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_19325_mrcnn_detection_1/map_24/while/Identity", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopmrcnn_detection_1/map_24/while/TensorArrayReadV3/_8401)]]

    Caused by op 'ROI_1/strided_slice_228', defined at:
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/__main__.py", line 3, in <module>
        app.launch_new_instance()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
        app.start()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 486, in start
        self.io_loop.start()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tornado/platform/asyncio.py", line 132, in start
        self.asyncio_loop.run_forever()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/asyncio/base_events.py", line 422, in run_forever
        self._run_once()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/asyncio/base_events.py", line 1434, in _run_once
        handle._run()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/asyncio/events.py", line 145, in _run
        self._callback(*self._args)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
        ret = callback()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
        return fn(*args, **kwargs)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 536, in <lambda>
        self.io_loop.add_callback(lambda : self._handle_events(self.socket, 0))
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
        self._handle_recv()
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
        self._run_callback(callback, msg)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
        callback(*args, **kwargs)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
        return fn(*args, **kwargs)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
        return self.dispatch_shell(stream, msg)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
        handler(stream, idents, msg)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
        user_expressions, allow_stdin)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
        res = shell.run_cell(code, store_history=store_history, silent=silent)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
        return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2662, in run_cell
        raw_cell, store_history, silent, shell_futures)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2785, in _run_cell
        interactivity=interactivity, compiler=compiler, result=result)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2903, in run_ast_nodes
        if self.run_code(code, result):
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "", line 10, in <module>
        model_dir=MODEL_DIR)
      File "/home/ubuntu/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/model.py", line 1832, in __init__
        self.keras_model = self.build(mode=mode, config=config)
      File "/home/ubuntu/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/model.py", line 1960, in build
        config=config)([rpn_class, rpn_bbox, anchors])
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/engine/topology.py", line 619, in __call__
        output = self.call(inputs, **kwargs)
      File "/home/ubuntu/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/model.py", line 296, in call
        names=["pre_nms_anchors"])
      File "/home/ubuntu/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/utils.py", line 828, in batch_slice
        inputs_slice = [x[i] for x in inputs]
      File "/home/ubuntu/crack_prediction/asphalt/bin/Mask_RCNN/mrcnn/utils.py", line 828, in <listcomp>
        inputs_slice = [x[i] for x in inputs]
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 523, in _slice_helper
        name=name)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 689, in strided_slice
        shrink_axis_mask=shrink_axis_mask)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8232, in strided_slice
        name=name)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
        op_def=op_def)
      File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
        self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

    InvalidArgumentError (see above for traceback): slice index 33 of dimension 0 out of bounds.
      [[Node: ROI_1/strided_slice_228 = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_input_anchors_1_0_0/_8681, mrcnn_detection_1/strided_slice_744/stack_1, mrcnn_detection_1/strided_slice_767/stack_1, mrcnn_mask_deconv_1/strided_slice/stack_1)]]
      [[Node: mrcnn_detection_1/map_24/while/Identity/_10586 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_19325_mrcnn_detection_1/map_24/while/Identity", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopmrcnn_detection_1/map_24/while/TensorArrayReadV3/_8401)]]

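
The loop in the question builds img_batch by hand before calling model.detect(). As far as I can tell, detect() in this repository asserts that len(images) equals config.BATCH_SIZE, so every call needs a full batch. Below is a minimal sketch of a batching helper. The name make_batches and the choice to pad the last batch by repeating its final image are assumptions made for illustration, not part of the repository.

```python
def make_batches(images, batch_size):
    """Yield (batch, n_real) pairs where every batch has exactly batch_size items.

    The last batch is padded by repeating its final image. n_real tells the
    caller how many leading entries of the batch are real inputs, so the
    results for the padding can be discarded.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(images), batch_size):
        batch = list(images[start:start + batch_size])
        n_real = len(batch)
        # Pad with copies of the last image so the batch is always full.
        batch += [batch[-1]] * (batch_size - n_real)
        yield batch, n_real


# Hypothetical usage with an already loaded model and config:
# for batch, n_real in make_batches(all_images, config.BATCH_SIZE):
#     results = model.detect(batch)[:n_real]
```

Padding costs a little wasted computation on the last batch, but it keeps the batch size constant for every call.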
austinegri commented 6 years ago

I believe I am hitting this error as well. I am running multiple images per GPU and get a "slice index out of bounds" error.

RyanLBWoods commented 5 years ago

I had the same "slice index out of bounds" error. I was trying to set IMAGES_PER_GPU to 16; when I changed it to 8, the error was gone. I have two machines: one with two 12 GB TITAN X cards, the other with four 12 GB GTX1080 cards. On the TITAN machine, 16 images per GPU works fine, but on the GTX1080 machine it has to be 8. Going by these numbers, I'm guessing there is either a limit on the batch size or a bug that occurs when the batch size is bigger than 32. I didn't test further and I don't know the cause, but I hope this helps as a temporary workaround. Hopefully someone more expert can fix it properly.
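
A quick arithmetic check on these numbers, as a sketch. In this repository's Config the effective batch size is IMAGES_PER_GPU multiplied by GPU_COUNT, and Keras's predict() processes its input in chunks of 32 when no batch_size is given. The snippet assumes GPU_COUNT was set to the number of cards in each machine, which the comment does not actually state; the helper name effective_batch_size is hypothetical.

```python
# Keras Model.predict() falls back to a batch size of 32 when none is passed.
KERAS_DEFAULT_PREDICT_BATCH = 32


def effective_batch_size(images_per_gpu, gpu_count):
    """Total images per detect() call: IMAGES_PER_GPU * GPU_COUNT."""
    return images_per_gpu * gpu_count


# ASSUMPTION: GPU_COUNT equals the number of cards in each machine.
setups = [
    ("two TITAN X, 16 per GPU", 16, 2),
    ("four GTX1080, 16 per GPU", 16, 4),
    ("four GTX1080, 8 per GPU", 8, 4),
]

for label, per_gpu, gpus in setups:
    total = effective_batch_size(per_gpu, gpus)
    verdict = "within" if total <= KERAS_DEFAULT_PREDICT_BATCH else "exceeds"
    print(f"{label}: batch size {total} ({verdict} the default of 32)")
```

Under that assumption, only the failing setup (four GPUs at 16 images each, 64 in total) goes past 32, which is consistent with the guess that the threshold is a total batch size of 32.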

jimmy15923 commented 5 years ago

Same problem! I can't run inference when the batch size is over 32!

codexponent commented 5 years ago

Change the following while testing! :)

    config.IMAGES_PER_GPU = 1
    config.GPU_COUNT = 1
    config.BATCH_SIZE = 1
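
With a batch size of 1, each detect() call takes a one-element list and returns a one-element list of results. A minimal sketch of the resulting loop follows; detect_one_by_one is a hypothetical helper, and detect_fn stands in for model.detect on a model built with the config above.

```python
def detect_one_by_one(detect_fn, images):
    """Call detect_fn once per image and collect the results in order.

    detect_fn is expected to behave like model.detect: it takes a list of
    images and returns a list with one result per image.
    """
    results = []
    for image in images:
        # Wrap each image in a list so the batch length is exactly 1.
        results.extend(detect_fn([image]))
    return results


# Hypothetical usage:
# results = detect_one_by_one(model.detect, all_images)
```

This sidesteps the error at the cost of throughput, since no batching happens on the GPU. As far as I can tell, Config computes BATCH_SIZE from the other two values when it is instantiated, so setting IMAGES_PER_GPU and GPU_COUNT in the class definition is the part that matters.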

codexponent commented 5 years ago

@jimmy15923, please follow the instructions above.

recallrui commented 5 years ago

Edit ./mrcnn/model.py (around line 2524). Change:

    detections, _, _, mrcnn_mask, _, _, _, *feature_maps =\
        self.keras_model.predict([molded_images, image_metas, anchors], verbose=0)
    results = []

to:

    detections, _, _, mrcnn_mask, _, _, _, *feature_maps =\
        self.keras_model.predict([molded_images, image_metas, anchors], batch_size=len(images), verbose=0)
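
A likely reason this helps: Keras's predict() splits its input into chunks of 32 when batch_size is not given, while the graph here is built to index a fixed number of images (the batch_slice line in the traceback). The pure-Python sketch below only mimics that interaction. split_into_chunks, fixed_batch_slice and run are hypothetical names, and none of this is the library's actual code.

```python
def split_into_chunks(n_samples, batch_size=32):
    """Sizes of the chunks a Keras-style predict loop would feed (default 32)."""
    return [min(batch_size, n_samples - start)
            for start in range(0, n_samples, batch_size)]


def fixed_batch_slice(chunk, graph_batch_size):
    """Mimic a graph that always indexes items 0..graph_batch_size-1."""
    return [chunk[i] for i in range(graph_batch_size)]


def run(n_images, graph_batch_size, predict_batch_size=32):
    """Feed n_images through the fixed-size slicing, one chunk at a time."""
    data = list(range(n_images))
    outputs = []
    start = 0
    for size in split_into_chunks(n_images, predict_batch_size):
        chunk = data[start:start + size]
        # Raises IndexError if the chunk is shorter than the graph expects.
        outputs.extend(fixed_batch_slice(chunk, graph_batch_size))
        start += size
    return outputs


try:
    run(34, graph_batch_size=34)  # default chunking: first chunk has only 32 items
except IndexError as exc:
    print("fails with default chunking:", exc)

print(len(run(34, graph_batch_size=34, predict_batch_size=34)), "items processed")
```

Passing batch_size=len(images) makes the chunk size match what the graph was built for, which is what the edit to model.py does.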