hunglc007 / tensorflow-yolov4-tflite

YOLOv4, YOLOv4-tiny, YOLOv3, YOLOv3-tiny Implemented in Tensorflow 2.0, Android. Convert YOLO v4 .weights tensorflow, tensorrt and tflite
https://github.com/hunglc007/tensorflow-yolov4-tflite
MIT License
2.23k stars 1.24k forks source link

Status: device kernel image is invalid #452

Closed Raunak-Singh-Inventor closed 2 years ago

Raunak-Singh-Inventor commented 2 years ago

When I try to convert the DarkNet YOLOv4 weights to TensorFlow Model using this command: python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4

I get the error: tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

TensorFlow version: 2.3.0 GPU: Nvidia RTX 3080 (Laptop version) CUDA version: 10.1 cuDNN version: 7.6.0 OpenCV version: 4.6.0 Python version: 3.6 Operating System: Pop OS (Ubuntu)

Complete logs:

2022-06-22 20:37:27.347411: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2022-06-22 20:37:27.997794: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2022-06-22 20:37:28.024021: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.024126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 Laptop GPU computeCapability: 8.6
coreClock: 1.71GHz coreCount: 48 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-22 20:37:28.024141: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2022-06-22 20:37:28.025076: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2022-06-22 20:37:28.026001: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2022-06-22 20:37:28.026153: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2022-06-22 20:37:28.027155: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2022-06-22 20:37:28.027713: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2022-06-22 20:37:28.029777: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2022-06-22 20:37:28.029869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.029980: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.030043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2022-06-22 20:37:28.030261: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-22 20:37:28.054709: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2299965000 Hz
2022-06-22 20:37:28.057396: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bfa0419570 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-06-22 20:37:28.057469: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-06-22 20:37:28.109929: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.110166: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bfa20ee5a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-06-22 20:37:28.110179: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3080 Laptop GPU, Compute Capability 8.6
2022-06-22 20:37:28.110445: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.110601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 Laptop GPU computeCapability: 8.6
coreClock: 1.71GHz coreCount: 48 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-22 20:37:28.110638: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2022-06-22 20:37:28.110682: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2022-06-22 20:37:28.110690: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2022-06-22 20:37:28.110714: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2022-06-22 20:37:28.110721: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2022-06-22 20:37:28.110728: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2022-06-22 20:37:28.110736: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2022-06-22 20:37:28.110789: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.110929: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-22 20:37:28.111027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2022-06-22 20:37:28.111066: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "save_model.py", line 4, in <module>
    from core.yolov4 import YOLO, decode, filter_boxes
  File "/home/rauna/Documents/tensorflow-yolov4-tflite/core/yolov4.py", line 292, in <module>
    def filter_boxes(box_xywh, scores, score_threshold=0.4, input_shape = tf.constant([416,416])):
  File "/home/rauna/anaconda3/envs/tf2.3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 264, in constant
    allow_broadcast=True)
  File "/home/rauna/anaconda3/envs/tf2.3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 275, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "/home/rauna/anaconda3/envs/tf2.3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 300, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "/home/rauna/anaconda3/envs/tf2.3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 97, in convert_to_eager_tensor
    ctx.ensure_initialized()
  File "/home/rauna/anaconda3/envs/tf2.3/lib/python3.6/site-packages/tensorflow/python/eager/context.py", line 539, in ensure_initialized
    context_handle = pywrap_tfe.TFE_NewContext(opts)
tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid
Raunak-Singh-Inventor commented 2 years ago

was able to fix by using TensorFlow 2.5