EdjeElectronics / TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

How to train a TensorFlow Object Detection Classifier for multiple object detection on Windows

object_detection_tutorial error: Blas SGEMM launch failed #473

Open VSustarAA opened 4 years ago

VSustarAA commented 4 years ago

Hello everyone, I have installed all the dependencies with tensorflow_gpu 1.5 and CUDA 9.0 following the instructions (on Windows 10), and I have an NVIDIA RTX 2060 with 6 GB of VRAM.

When I run object_detection_tutorial.py, the last cell fails with the error message: Blas SGEMM launch failed

I suspect it might be due to too little RAM on my graphics card. I've read that one can limit the amount of GPU memory used via:

    gpu_options = tf.GPUOptions(allow_growth=True)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
    with tf.Session(graph=detection_graph, config=tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True, log_device_placement=True)) as sess:

but I am not sure I've put it in the right place; it does not have any effect.
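For reference, the two tf.GPUOptions calls in that snippet overwrite each other, so allow_growth is discarded before the Session is ever created. Below is a minimal sketch of combining them into one GPUOptions object and passing the config to the Session in the tutorial's last cell; variable names such as detection_graph, image_tensor and image_np_expanded are assumed to match the tutorial script.

```python
import tensorflow as tf

# Combine both options into a single GPUOptions object instead of overwriting it.
gpu_options = tf.GPUOptions(allow_growth=True,
                            per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True)

with detection_graph.as_default():
    # Pass the config to the Session that actually runs the detection loop.
    with tf.Session(graph=detection_graph, config=config) as sess:
        (boxes, scores, classes, num) = sess.run(
            [detection_boxes, detection_scores, detection_classes, num_detections],
            feed_dict={image_tensor: image_np_expanded})
```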

I was able to make Mask R-CNN and YOLACT work on my Ubuntu boot....

Would anyone know how to solve the issue?

Here's the error message:

####################################################################################################################################################

InternalError                             Traceback (most recent call last)
c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1349     try:
-> 1350       return fn(*args)
   1351     except errors.OpError as e:

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1328                                    feed_dict, fetch_list, target_list,
-> 1329                                    status, run_metadata)
   1330

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472           compat.as_text(c_api.TF_Message(self.status.status)),
--> 473           c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

InternalError: Blas SGEMM launch failed : m=22500, n=64, k=32
	 [[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/BatchNorm/batchnorm/mul_1 = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_pointwise/weights/read/_98__cf__101)]]
	 [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_82/Gather/Gather_2/_509 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_5277_Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/ClipToWindow_82/Gather/Gather_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

InternalError Traceback (most recent call last)

in <module>()
     23       (boxes, scores, classes, num) = sess.run(
     24           [detection_boxes, detection_scores, detection_classes, num_detections],
---> 25           feed_dict={image_tensor: image_np_expanded})
     26       # Visualization of the results of a detection.
     27       vis_util.visualize_boxes_and_labels_on_image_array(

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1126     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1127       results = self._do_run(handle, final_targets, final_fetches,
-> 1128                              feed_dict_tensor, options, run_metadata)
   1129     else:
   1130       results = []

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1342     if handle is None:
   1343       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1344                            options, run_metadata)
   1345     else:
   1346       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

c:\programdata\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1361     except KeyError:
   1362       pass
-> 1363     raise type(e)(node_def, op, message)
   1364
   1365   def _extend_graph(self):

InternalError: Blas SGEMM launch failed : m=22500, n=64, k=32
	 [[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/BatchNorm/batchnorm/mul_1 = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_pointwise/weights/read/_98__cf__101)]] ...
HrabaThor commented 3 years ago

Hi, I'm currently having the same issue while running model_main_tf2.py from the Object Detection API. Did you solve it?

HrabaThor commented 3 years ago

Ok, so apparently posting a question about your problem helps :) The problem was that TensorFlow wanted to allocate all GPU memory at once (I didn't have this issue with lower-power GPUs such as a GeForce GTX 1060 Ti). So when launching a script, add the piece of code below right after importing TensorFlow; it tells TF to allocate memory gradually.

This is my first comment on GitHub, so I'm sorry for the bad code formatting; I don't know what's going on, but hopefully you can understand these pieces of Python code :)

    import tensorflow as tf
    gpus = tf.config.experimental.list_physical_devices('GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)

This works when you have a single GPU; when you use more of them, you have to iterate over them and set it individually, like this:

    import tensorflow as tf
    for gpu in tf.config.experimental.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)
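One caveat: tf.config.experimental is a TF 2.x (and late 1.x) API, so it is not available on the tensorflow_gpu 1.5 used in the original question. There, the equivalent is the ConfigProto route the question already hints at; a minimal sketch, assuming the same detection_graph from the tutorial:

```python
import tensorflow as tf

# TF 1.x equivalent of set_memory_growth: allocate GPU memory gradually
# instead of grabbing it all when the session starts.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(graph=detection_graph, config=config) as sess:
    pass  # run the detection loop here, as in the tutorial
```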