FER not working with GPU

Hi, I was able to use FER on CPU, but cannot make it work on GPU. Based on this link (https://www.tensorflow.org/install/source#tested_build_configurations) I have checked my configuration, and it looks fine and supported: tensorflow 2.4.0 / python 3.8.10 / cuda 11.0.3 / cudnn 8.0.5 (I have tried other setups as well, but the results were even worse...)

When I try to run the example.py, the GPU device detected, the cuda libraries successfully opened, but after that I get the following errors: 2021-08-27 11:12:16.722491: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED 2021-08-27 11:12:16.722689: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at conv_ops.cc:1106 : Not found: No algorithm worked! Traceback (most recent call last): File "test.py", line 11, in result = detector.detect_emotions(image) File "/usr/local/lib/python3.8/dist-packages/fer/fer.py", line 225, in detect_emotions face_rectangles = self.find_faces(img, bgr=True) File "/usr/local/lib/python3.8/dist-packages/fer/fer.py", line 182, in find_faces results = self._mtcnn.detect_faces(img) File "/usr/local/lib/python3.8/dist-packages/mtcnn/mtcnn.py", line 300, in detect_faces result = stage(img, result[0], result[1]) File "/usr/local/lib/python3.8/dist-packages/mtcnn/mtcnn.py", line 342, in stage1 out = self._pnet.predict(img_y) File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/training_v1.py", line 982, in predict return func.predict( File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 706, in predict return predict_loop( File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/training_arrays_v1.py", line 384, in model_iteration batch_outs = f(ins_batch) File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/backend.py", line 3956, in call fetched = self._callable_fn(*array_vals, File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1480, in call__ ret = tf_session.TF_SessionRunCallable(self._session._session, tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found. (0) Not found: No algorithm worked! [[{{node conv2d/Conv2D}}]] (1) Not found: No algorithm worked! [[{{node conv2d/Conv2D}}]] [[conv2d_4/BiasAdd/_783]] 0 successful operations. 0 derived errors ignored.

Any suggestion what shall I do? (By the way, there is no difference if I install tensorflow==2.4.0 or tensorflow-gpu==2.4.0...)

Thanks!

Thanks for reporting the issue.

I suspect possible compatibility issues between the OS and CUDA/cudnn versions:

I ran the example.py under the env:

OS: Windows 10 Python : 3.6 TF:2.4 CUDA:10.2 with 11.0 dll Cudnn:8.2

The script ran without issues as seen below;

2021-08-28 14:07:48.781767: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term 2021-08-28 14:08:42.383958: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll 2021-08-28 14:08:47.564227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-08-28 14:08:47.602994: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll 2021-08-28 14:08:47.743832: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found 2021-08-28 14:08:47.801707: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found 2021-08-28 14:08:48.108700: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll 2021-08-28 14:08:48.163044: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll 2021-08-28 14:08:48.172109: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found 2021-08-28 14:08:48.180278: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found 2021-08-28 14:08:48.189189: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found 2021-08-28 14:08:48.195744: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... 2021-08-28 14:08:48.347281: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2021-08-28 14:08:48.421196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-08-28 14:08:48.435452: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 2021-08-28 14:08:51.170786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-08-28 14:08:51.183624: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... 2021-08-28 14:08:53.758128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-08-28 14:08:53.765005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 2021-08-28 14:08:53.769173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. 28-08-2021:14:08:54,419 WARNING [deprecation.py:336] From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\engine\training.py:2426: UserWarning:Model.state_updateswill be removed in a future version. This property should not be used in TensorFlow 2.0, asupdatesare applied automatically. warnings.warn('Model.state_updateswill be removed in a future version. ' [{'box': (83, 83, 200, 200), 'emotions': {'angry': 0.0, 'disgust': 0.0, 'fear': 0.0, 'happy': 0.97, 'sad': 0.0, 'surprise': 0.0, 'neutral': 0.03}}]

May I know your

OS
Do you have multiple versions of CUDA installed?

Please try to upgrade to Cudnn==8.2 and let us know if the error persists

It ran without issues as you said but does not appear to be using GPUs, which is the issue reported:

Could not load dynamic library 'cudnn64_8.dll';

On Sat 28. Aug 2021 at 14:36 Saranraj Nambusubramaniyan < @.***> wrote:

Thanks for reporting the issue.

I suspect possible compatibility issues between the OS and CUDA/cudnn versions:

I ran the example.py under the env:

OS: Windows 10 Python : 3.6 TF:2.4 CUDA:10.2 with 11.0 dll Cudnn:8.2

The script ran without issues as seen below;

2021-08-28 14:07:48.781767: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term 2021-08-28 14:08:42.383958: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll 2021-08-28 14:08:47.564227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-08-28 14:08:47.602994: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll 2021-08-28 14:08:47.743832: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found 2021-08-28 14:08:47.801707: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found 2021-08-28 14:08:48.108700: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll 2021-08-28 14:08:48.163044: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll 2021-08-28 14:08:48.172109: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found 2021-08-28 14:08:48.180278: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found 2021-08-28 14:08:48.189189: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found 2021-08-28 14:08:48.195744: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... 2021-08-28 14:08:48.347281: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2021-08-28 14:08:48.421196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-08-28 14:08:48.435452: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 2021-08-28 14:08:51.170786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-08-28 14:08:51.183624: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... 2021-08-28 14:08:53.758128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-08-28 14:08:53.765005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 2021-08-28 14:08:53.769173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. 28-08-2021:14:08:54,419 WARNING [deprecation.py:336] From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\engine\training.py:2426: UserWarning: Model.state_updateswill be removed in a future version. This property should not be used in TensorFlow 2.0, asupdates are applied automatically. warnings.warn('Model.state_updates will be removed in a future version. ' [{'box': (83, 83, 200, 200), 'emotions': {'angry': 0.0, 'disgust': 0.0, 'fear': 0.0, 'happy': 0.97, 'sad': 0.0, 'surprise': 0.0, 'neutral': 0.03}}]

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/justinshenk/fer/issues/30#issuecomment-907620713, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACOLMZG56ZADOG4XOY7X2FTT7DJ3BANCNFSM5C5EID6Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

Dear @Saran-nns and @justinshenk,

My results were similar... If there is any problem with the GPU initialization (similar to your example, @Saran-nns), then the system falls back using the CPU, so everything works (or at least it looks like). But if the GPU initialization is OK, than after that there will be errors.

So I think the question is still pending...

Best regards, Csaba

Hi @kormoczi . Thanks for the update.

I updated CUDA and cudnn and found the example.py ran successfully with GPU.

Logs:

(tfgpu) N:\fer>python example.py 2021-09-02 17:16:47.723348: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\compat\v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term 2021-09-02 17:18:06.955330: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll 2021-09-02 17:18:13.564614: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-09-02 17:18:13.576812: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll 2021-09-02 17:18:16.750779: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll 2021-09-02 17:18:16.756477: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll 2021-09-02 17:18:17.022092: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll 2021-09-02 17:18:17.790750: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll 2021-09-02 17:18:18.722597: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll 2021-09-02 17:18:19.191502: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll 2021-09-02 17:18:21.192585: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll 2021-09-02 17:18:22.231133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0 2021-09-02 17:18:23.184233: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2021-09-02 17:18:23.578463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-09-02 17:18:23.592416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0 2021-09-02 17:18:47.042383: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-09-02 17:18:47.071966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 2021-09-02 17:18:47.076344: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N 2021-09-02 17:18:47.330953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4484 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1) 2021-09-02 17:18:49.850172: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1060 with Max-Q Design computeCapability: 6.1 coreClock: 1.3415GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s 2021-09-02 17:18:49.861020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0 2021-09-02 17:18:49.865012: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-09-02 17:18:49.870792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 2021-09-02 17:18:49.873996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N 2021-09-02 17:18:49.877882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4484 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1) WARNING:tensorflow:From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. 02-09-2021:17:18:50,945 WARNING [deprecation.py:336] From C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\layers\normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. C:\Users\saran\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow\python\keras\engine\training.py:2426: UserWarning:Model.state_updateswill be removed in a future version. This property should not be used in TensorFlow 2.0, asupdatesare applied automatically. warnings.warn('Model.state_updateswill be removed in a future version. ' 2021-09-02 17:18:56.204265: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll 2021-09-02 17:19:15.136577: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8202 2021-09-02 17:19:48.728937: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll 2021-09-02 17:19:57.370451: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll 2021-09-02 17:20:06.030969: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_FLOAT } } inputs { dtype: DT_FLOAT shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "GeForce GTX 1060 with Max-Q Design" frequency: 1341 num_cores: 10 environment { key: "architecture" value: "6.1" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 1572864 shared_memory_size_per_multiprocessor: 98304 memory_size: 4702352179 bandwidth: 192192000 } outputs { dtype: DT_FLOAT shape { unknown_rank: true } } [{'box': (83, 83, 200, 200), 'emotions': {'angry': 0.0, 'disgust': 0.0, 'fear': 0.0, 'happy': 0.97, 'sad': 0.0, 'surprise': 0.0, 'neutral': 0.03}}]

From your log, it is clear that the CUDA couldn't reach the cudnn .dll files.

Please make sure that,

You have the right version of cudnn. I suggest cuda 11.x, cudnn 8.x
You have added cudnn in your system path
Copy and paste the dll files from CUDNN bin to CUDA bin as in user guide
Please remove any old versions of CUDA and cudnn from your system paths to avoid path conflicts
Restart your system

Hope this helps

@kormoczi Also, 5. Restart your system

Hi @Saran-nns,

Thanks for your suggestions, I will check them... But I have two questions:

You wrote this: "From your log, it is clear that the CUDA couldn't reach the cudnn .dll files." Which part of my log shows this?
You suggested to use cuda 11.x, cudnn 8.x, but as I have stated in the beginning, I am using cuda 11.0.3 / cudnn 8.0.5 already, so this should be ok... No?

Best regards

Hi @Saran-nns,

I have checked the project again, and to my very big surprise, after I have re-built the docker image (without any modification), right now the example is working without any problem! I can't tell yet, what has changed, but most probably not the FER library and not the CUDA/CuDNN...

Best regards

@kormoczi Great that it works through docker. cuda 11.x is not packaged with cublas. cudnn provides this functionality for ml frameworks like tf, pytorch or keras to generate (initialize) any cublas handles. Even you have the right versions installed, the error could still throw if their (cuda and cudnn) paths(including the python environment) are not well defined.

Thanks for the issue again and hope you enjoy ferr'ing :)

JustinShenk / fer

FER not working with GPU #30