natethegreate / hent-AI

Automation of censor bar detection
MIT License

CUDNN_STATUS_INTERNAL_ERROR on 2nd Phase of Video Processing #14

Open thrwayDrk opened 3 years ago

thrwayDrk commented 3 years ago

Using the recommended GPU runtime on Google Colab, I get the following error every time:

Video read complete. Starting video phase 2: detection + splice
frame:  0 / 7032.0
2020-09-04 20:33:18.236907: E tensorflow/stream_executor/cuda/cuda_dnn.cc:332] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I looked into the error on StackOverflow, but the usual answer is a mismatch between the CuDNN and CUDA versions. I tried upgrading cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb to its 10.1 equivalent, but hent-AI appears to be pinned to the 9.0 version through its references.

Is there a better fix for CUDNN_STATUS_INTERNAL_ERROR?
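
One workaround I've seen suggested (rather than swapping CUDA versions) is letting TensorFlow allocate GPU memory on demand instead of grabbing it all at once. I'm not sure where hent-AI actually creates its session, so this is only a rough sketch of what the TF 1.x version would look like:

import tensorflow as tf
import keras.backend as K

# Workaround sketch (not hent-AI's actual code): grow GPU memory on demand
# instead of pre-allocating it, which sometimes avoids CUDNN_STATUS_INTERNAL_ERROR.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))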

thrwayDrk commented 3 years ago

Following the Colab instructions directly on a Tesla P100-PCIE:

/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
Weights: weights.h5
Dataset: None
Logs: /logs
Starting inference
2020-10-10 18:33:52.275696: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-10-10 18:33:52.430784: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-10 18:33:52.431371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:04.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2020-10-10 18:33:52.431403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2020-10-10 18:33:52.839807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-10 18:33:52.839858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2020-10-10 18:33:52.839870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2020-10-10 18:33:52.839977: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/device:GPU:0 with 15159 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
CUDA-compatible GPU located!
Model warmup complete
Detected fps: 24.0
Video read complete. Starting video phase 1 : resize + GAN
frame: 0 / 2019.0
/usr/local/lib/python3.5/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
frame: 1 / 2019.0
frame: 2 / 2019.0
frame: 3 / 2019.0
frame: 4 / 2019.0
frame: 5 / 2019.0
frame: 6 / 2019.0
frame: 7 / 2019.0
frame: 8 / 2019.0
frame: 9 / 2019.0
frame: 10 / 2019.0
frame: 11 / 2019.0
frame: 12 / 2019.0
frame: 13 / 2019.0
frame: 14 / 2019.0
frame: 15 / 2019.0
frame: 16 / 2019.0
frame: 17 / 2019.0
frame: 18 / 2019.0
frame: 19 / 2019.0
frame: 20 / 2019.0
frame: 21 / 2019.0
frame: 22 / 2019.0
....
frame: 2019 / 2019.0
Video: Phase 1 complete!
Detected fps: 30.0
Video read complete. Starting video phase 1 : resize + GAN
frame: 0 / 98.0
Granularity was less than threshold at 9
frame: 1 / 98.0
Granularity was less than threshold at 9
frame: 2 / 98.0
Granularity was less than threshold at 9
frame: 3 / 98.0
Granularity was less than threshold at 9
frame: 4 / 98.0
Granularity was less than threshold at 9
frame: 5 / 98.0
Granularity was less than threshold at 9
frame: 6 / 98.0
frame: 7 / 98.0
frame: 8 / 98.0
frame: 9 / 98.0
frame: 10 / 98.0
frame: 11 / 98.0
....
frame: 97 / 98.0
frame: 98 / 98.0
Video: Phase 1 complete!
2020-10-10 19:16:32.370362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2020-10-10 19:16:32.370521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-10 19:16:32.370539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2020-10-10 19:16:32.370549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2020-10-10 19:16:32.370655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15159 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
Creating model, Loading weights...
Weights loaded
Detected fps: 0.0
Video read complete. Starting video phase 2: detection + splice
frame: 0 / 0.0
Video: Phase 2 complete!
Attempting to create a copy with audio included...
ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
/content/drive/My Drive/hent-AI/videos//Timeline 1_decensored.mp4: No such file or directory
ERROR in ESRGAN: audio rip. Ensure ffmpeg.exe is in the main directory.
ffmpeg error (see stderr output for detail)
Detected fps: 30.0
Video read complete. Starting video phase 2: detection + splice
frame: 0 / 98.0
2020-10-10 19:16:44.305466: E tensorflow/stream_executor/cuda/cuda_dnn.cc:332] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

thrwayDrk commented 3 years ago

It appears to fail at line 468 of detector.py.

vcapture returns False right away; maybe the video path is wrong?
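
An easy way to check that theory is to open the same path with OpenCV directly; if isOpened() comes back False or the fps reads 0.0, the problem is the path (or a missing file from phase 1), not cuDNN. Rough sketch with a placeholder path:

import cv2

video_path = "/content/drive/My Drive/hent-AI/videos/example.mp4"  # placeholder, use your actual input
vcapture = cv2.VideoCapture(video_path)
print("opened:", vcapture.isOpened())                   # False -> bad path or unreadable file
print("fps:", vcapture.get(cv2.CAP_PROP_FPS))           # 0.0 also points at the file, not the GPU
print("frames:", vcapture.get(cv2.CAP_PROP_FRAME_COUNT))
vcapture.release()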

ethanfel commented 3 years ago

I have the exact same issue. Did you find a fix?

10882 commented 3 years ago

When I start ESRGAN, I get this error. DCP works normally.

CUDA-compatible GPU located!
Starting ESRGAN detection and decensor
Model warmup complete
Detected fps: 30.01919385796545
Video read complete. Starting video phase 1 : resize + GAN
frame:  0 / 1031.0
Granularity was less than threshold at  8
Exception in Tkinter callback
Traceback (most recent call last):
  File "D:\conda\envs\hentai\lib\tkinter\__init__.py", line 1550, in __call__
    return self.func(*args)
  File "main.py", line 331, in <lambda>
    go_button = Button(mos_win, text="Go!", command = lambda: hentAI_TGAN(in_path=o_entry.get(), is_video=True))
  File "main.py", line 184, in hentAI_TGAN
    detect_instance.run_ESRGAN(in_path = in_path, is_video = is_video, force_jpg = force_jpg)
  File "C:\Users\10882\Desktop\decencur\hent-AI-master\detector.py", line 379, in run_ESRGAN
    self.resize_GAN(img_path=img_path, img_name=img_name, is_video=is_video)
  File "C:\Users\10882\Desktop\decencur\hent-AI-master\detector.py", line 256, in resize_GAN
    self.esrgan_instance.run_esrgan(test_img_folder=file_name, out_filename=gan_img_path, mosaic_res=granularity)
  File "C:\Users\10882\Desktop\decencur\hent-AI-master\ColabESRGAN\test.py", line 49, in run_esrgan
    output = self.model(img_LR).data.squeeze().float().cpu().clamp_(0, 1).numpy()
  File "D:\conda\envs\hentai\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\10882\Desktop\decencur\hent-AI-master\ColabESRGAN\architecture.py", line 37, in forward
    x = self.model(x)
  File "D:\conda\envs\hentai\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\conda\envs\hentai\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
    input = module(input)
  File "D:\conda\envs\hentai\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\conda\envs\hentai\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CuDNN error: CUDNN_STATUS_INTERNAL_ERROR
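
In case it helps: one workaround that sometimes gets past CUDNN_STATUS_INTERNAL_ERROR in PyTorch (often cuDNN failing to allocate workspace memory for a large convolution) is to disable cuDNN before the ESRGAN model runs. It is slower, and exactly where to put it in ColabESRGAN/test.py is a guess, but roughly:

import torch

# Workaround sketch, not part of hent-AI: fall back to plain CUDA kernels so cuDNN
# never has to allocate its workspace; noticeably slower for the ESRGAN pass.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.enabled = False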