exadel-inc / CompreFace

Leading free and open-source face recognition system
https://exadel.com/accelerator-showcase/compreface/
Apache License 2.0

DAMN ! worker 1 (pid: 38) died, killed by signal 9 :( trying respawn ... #854

Open chrisborell opened 2 years ago

chrisborell commented 2 years ago

Describe the bug

I'm trying to run Mobilenet-gpu. When I upload a test image, it fails and the logs show "DAMN ! worker 1 (pid: 38) died, killed by signal 9 :( trying respawn ...". CompreFace freezes and reboots, and it goes into an endless loop when I try again.


Desktop (please complete the following information):
GPU: NVIDIA GeForce RTX 3050
GPU Memory: 4.0
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
RAM: 8.0GB

NVIDIA-SMI 510.73.05 Driver Version: 516.93 CUDA Version: 11.7

Additional context

compreface-ui | 172.18.0.1 - - [30/Jul/2022:03:56:43 +0000] "GET /admin/user/demo/model HTTP/1.1" 200 52 "http://localhost:8000/login?redirect=%2F" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
compreface-core | [03:56:49] ../src/nnvm/legacy_json_util.cc:208: Loading symbol saved by previous version v1.2.0. Attempting to upgrade...
compreface-core | [03:56:49] ../src/nnvm/legacy_json_util.cc:216: Symbol successfully upgraded!
compreface-core | [03:56:49] ../src/engine/engine.cc:54: MXNet start using engine: ThreadedEnginePerDevice
compreface-core | [03:56:49] ../src/base.cc:79: cuDNN lib mismatch: linked-against version 8101 != compiled-against version 8100.  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
compreface-core | DAMN ! worker 1 (pid: 38) died, killed by signal 9 :( trying respawn ...
compreface-core | Respawned uWSGI worker 1 (new pid: 59)
compreface-core | [03:57:06] ../src/nnvm/legacy_json_util.cc:208: Loading symbol saved by previous version v1.2.0. Attempting to upgrade...
compreface-core | [03:57:06] ../src/nnvm/legacy_json_util.cc:216: Symbol successfully upgraded!
compreface-core | [03:57:06] ../src/engine/engine.cc:54: MXNet start using engine: ThreadedEnginePerDevice
compreface-core | [03:57:06] ../src/base.cc:79: cuDNN lib mismatch: linked-against version 8101 != compiled-against version 8100.  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.

Logs.txt

pospielov commented 2 years ago

Sorry for the late response. I was looking for a way to reproduce it. It seems MXNet can't load the model, but it's unclear why. It is also not reproducible on an RTX 3060. Any chance that your GPU is busy with something else?
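
One rough way to check whether the GPU is busy is to query `nvidia-smi` just before uploading the test image. The sketch below is only an illustration and is not part of CompreFace: the helper names (`parse_gpu_rows`, `query_gpus`, `print_report`) are made up for this example, and it assumes `nvidia-smi` is on the PATH of whichever environment you run it in (host or container).

```python
import subprocess

# Fields requested from nvidia-smi; memory values come back in MiB,
# utilization in percent when --format includes "nounits".
QUERY_FIELDS = "memory.used,memory.total,utilization.gpu"


def _to_int(field):
    """Convert one CSV field to int, or None if nvidia-smi reports e.g. [N/A]."""
    try:
        return int(field.strip())
    except ValueError:
        return None


def parse_gpu_rows(csv_text):
    """Parse 'used, total, util' CSV lines into one dict per GPU (hypothetical helper)."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        used_mib, total_mib, util_pct = (_to_int(f) for f in line.split(","))
        rows.append({"used_mib": used_mib, "total_mib": total_mib, "util_pct": util_pct})
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows; raises if nvidia-smi is missing or fails."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_rows(result.stdout)


def print_report():
    """Print memory use and utilization for each GPU that nvidia-smi lists."""
    for index, gpu in enumerate(query_gpus()):
        print(
            f"GPU {index}: {gpu['used_mib']} / {gpu['total_mib']} MiB used, "
            f"utilization {gpu['util_pct']}%"
        )
```

Calling `print_report()` while CompreFace is stopped shows how much of the GPU memory other processes already hold. If most of it is in use before the model loads, that would support the "GPU is busy" theory, though it would not prove that this is what kills the worker.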