Qidian213 / deep_sort_yolov3

Real-time Multi-person tracker using YOLO v3 and deep_sort with tensorflow
GNU General Public License v3.0

Error when running on GPU, could an expert help? #127

Open huillll opened 5 years ago

huillll commented 5 years ago

Using TensorFlow backend.
C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\utils\linear_assignment_.py:21: DeprecationWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  DeprecationWarning)
WARNING: Logging before flag parsing goes to stderr.
W0906 11:46:44.912530 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

W0906 11:46:44.913538 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0906 11:46:44.914524 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:203: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2019-09-06 11:46:44.927818: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-09-06 11:46:44.933615: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2019-09-06 11:46:45.025417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
2019-09-06 11:46:45.032457: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-09-06 11:46:45.036569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-09-06 11:46:45.657591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-06 11:46:45.661410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2019-09-06 11:46:45.663803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2019-09-06 11:46:45.667818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6386 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
W0906 11:46:45.675489 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

W0906 11:46:45.678480 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0906 11:46:46.023557 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:2041: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

W0906 11:46:49.791480 22940 deprecation_wrapper.py:119] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:2239: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

model_data/yolo.h5 model, anchors, and classes loaded.
W0906 11:46:54.066046 22940 deprecation.py:323] From C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\array_ops.py:1354: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2019-09-06 11:46:56.871112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
2019-09-06 11:46:56.876247: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-09-06 11:46:56.879434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-09-06 11:46:56.882695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
2019-09-06 11:46:56.887427: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-09-06 11:46:56.890983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-09-06 11:46:56.893296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-06 11:46:56.896007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2019-09-06 11:46:56.898004: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2019-09-06 11:46:56.900115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6386 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
W0906 11:46:56.905462 22940 deprecation_wrapper.py:119] From E:\git_test\deep_sort_yolov3\tools\generate_detections.py:76: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

2019-09-06 11:46:58.206501: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2019-09-06 11:46:58.468621: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] layout failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2019-09-06 11:46:58.783956: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: Subshape must have computed start >= end since stride is negative, but is 0 and 2 (computed from start 0 and end 9223372036854775807 over shape with rank 2 and stride-1)
2019-09-06 11:47:00.150272: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.1. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2019-09-06 11:47:00.164328: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.1. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Traceback (most recent call last):
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d_1/convolution}}]]
     [[concat_11/_3623]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d_1/convolution}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo.py", line 117, in <module>
    main(YOLO())
  File "demo.py", line 63, in main
    boxs = yolo.detect_image(image)
  File "E:\git_test\deep_sort_yolov3\yolo.py", line 95, in detect_image
    K.learning_phase(): 0
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv2d_1/convolution (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:3940) ]]
     [[concat_11/_3623]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv2d_1/convolution (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:3940) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node conv2d_1/convolution:
 conv2d_1/kernel/read (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:422)
 input_1 (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:541)

Input Source operations connected to node conv2d_1/convolution:
 conv2d_1/kernel/read (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:422)
 input_1 (defined at C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py:541)

Original stack trace for 'conv2d_1/convolution':
  File "demo.py", line 117, in <module>
    main(YOLO())
  File "E:\git_test\deep_sort_yolov3\yolo.py", line 33, in __init__
    self.boxes, self.scores, self.classes = self.generate()
  File "E:\git_test\deep_sort_yolov3\yolo.py", line 54, in generate
    self.yolo_model = load_model(model_path, compile=False)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\saving.py", line 458, in load_wrapper
    return load_function(*args, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\saving.py", line 550, in load_model
    model = _deserialize_model(h5dict, custom_objects, compile)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\saving.py", line 243, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\saving.py", line 593, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\layers\__init__.py", line 168, in deserialize
    printable_module_name='layer')
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\utils\generic_utils.py", line 147, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\network.py", line 1062, in from_config
    process_node(layer, node_data)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\network.py", line 1012, in process_node
    layer(unpack_singleton(input_tensors), **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\engine\base_layer.py", line 451, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\layers\convolutional.py", line 171, in call
    dilation_rate=self.dilation_rate)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend\tensorflow_backend.py", line 3940, in conv2d
    data_format=tf_data_format)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 894, in convolution
    name=name)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 971, in convolution_internal
    name=name)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1161, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

sunanlin13174 commented 4 years ago

Most likely the GPU memory is full. You could try reducing the batch size.
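If GPU memory exhaustion is the suspected cause, one thing that may be worth trying alongside a smaller batch is asking TensorFlow to allocate GPU memory on demand instead of reserving most of it up front. The sketch below uses the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable described in the TensorFlow GPU guide. The helper name `request_gpu_memory_growth` is made up for illustration, and nothing in this thread confirms that this resolves the error reported here.

```python
import os


def request_gpu_memory_growth(env=os.environ):
    """Ask TensorFlow to grow its GPU allocation on demand.

    Hypothetical helper, not part of deep_sort_yolov3. It only sets an
    environment variable, so it has to run before TensorFlow creates its
    first session; calling it before any TensorFlow or Keras import is
    the simplest way to be safe.
    """
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env["TF_FORCE_GPU_ALLOW_GROWTH"]


request_gpu_memory_growth()
# Import the detector only after the variable is set, for example:
# from yolo import YOLO
```

Setting the variable from Python rather than in the shell keeps the change local to the script, which makes it easy to remove again if it has no effect.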

CinderellaRobaker commented 4 years ago

When I try to test a single image like this:

from yolo import YOLO
from PIL import Image
import imageio

yolo = YOLO()
image = imageio.imread('test.jpg')
frame = Image.fromarray(image)  # imageio.imread already returns RGB, so no BGR-to-RGB flip is needed
boxs = yolo.detect_image(frame)  # error occurs

I get the same error in the following environment: keras 2.3.1, tensorflow 1.15.0, cuda 10.0, python 3.7.

GPU: RTX 2060 with 6 GB memory.

Any suggestions or questions would be helpful.
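The environment listed above does not include the cuDNN version, while the log in the original post shows a `cuda_dnn.cc` line reporting a cuDNN mismatch right before the same "Failed to get convolution algorithm" error. As a minimal sketch, assuming the console output has been captured as text, the following looks for that line and applies the rule exactly as the message words it for cuDNN 7.0 or later. The function names are hypothetical, and a missing line does not rule out other causes.

```python
import re

# Matches the cuda_dnn.cc message quoted in the original post's log.
_MISMATCH = re.compile(
    r"Loaded runtime CuDNN library: (\d+)\.(\d+)\.(\d+) "
    r"but source was compiled with: (\d+)\.(\d+)\.(\d+)"
)


def parse_cudnn_versions(log_text):
    """Return (loaded, compiled) version tuples, or None if the line is absent."""
    match = _MISMATCH.search(log_text)
    if match is None:
        return None
    numbers = tuple(int(group) for group in match.groups())
    return numbers[:3], numbers[3:]


def runtime_is_compatible(loaded, compiled):
    """Rule as stated in the log message for cuDNN 7.0 or later:
    the major versions match and the loaded minor version is equal or higher."""
    return loaded[0] == compiled[0] and loaded[1] >= compiled[1]


line = "Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.1."
loaded, compiled = parse_cudnn_versions(line)
print(loaded, compiled, runtime_is_compatible(loaded, compiled))
# prints: (7, 3, 1) (7, 4, 1) False
```

For the versions quoted in the original post the check reports an incompatible runtime, which is consistent with the log's own advice to upgrade the cuDNN library when using a binary install.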

ZERO-SPACE-X commented 4 years ago

@huillll Did you solve this problem? Could you tell me how you fixed it? I am running into the same problem.

gzchenjiajun commented 4 years ago

@CinderellaRobaker Have you solved the problem yet? I have the same problem.