[Open] akrea opened this issue 2 years ago
So I will wait until the Nextcloud Docker image provides the dlib package by default. That should happen as soon as dlib is out of its "infant" state.
Still, I would prefer to use CUDA cores with the external model.
cheers
I was able to get the external model running with CUDA on Ubuntu, but I didn't document how I did it.
Since I want to get rid of my Ubuntu setup, I might try to get it running in a Docker container. However, I have zero experience with Docker, so this will take quite some time!
After much testing I managed to build a Docker image that works for me: https://github.com/mrbrdo/facerecognition-external-model-cuda
I am using NVIDIA driver version 535 on Ubuntu 22.04 with nvidia-container-toolkit (see the NVIDIA install docs).
To use it (port 8090):
cd facerecognition-external-model-cuda
docker build --tag 'facerecognition-external-cuda' .
sudo docker run --rm --runtime=nvidia --gpus all -i -p 8090:5000 -v /PATH/TO/YOUR/api.key:/app/api.key -e FACE_MODEL=4 --name facerecognition-external-cuda facerecognition-external-cuda
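Before building, it may help to confirm that Docker can see the GPU at all. A minimal sanity check, assuming nvidia-container-toolkit is already configured (the CUDA base image tag here is just an example, not taken from the repo):

```shell
# If the toolkit is wired up correctly, this prints the same nvidia-smi
# table as on the host; otherwise it falls through to the hint below.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi \
  || echo "GPU not visible to Docker; check the nvidia-container-toolkit install"
```

If this already fails, no amount of fiddling with the face recognition image will help, so it is worth ruling out first.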
Hi Jan, Matias,
As I have a huge image collection and an RTX card, I jumped at trying this.
I was able to build and run the CUDA image, but when I try to use it with face:background_job, I get this error:
CNN_DETECTOR = dlib.cnn_face_detection_model_v1(DETECTOR_PATH)
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file
/tmp/pip-wheel-js0yp6db/dlib_666e8ae1fada4f51b17d0e199472f5a8/dlib/cuda/gpu_data.cpp:204.
code: 500, reason: named symbol not found
Could it be that my installed driver is newer?
This is what I get with nvidia-smi in your container:
# nvidia-smi
Sun May 26 18:04:20 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.00 Driver Version: 560.38 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 2080 ... On | 00000000:01:00.0 On | N/A |
| 0% 41C P2 53W / 250W | 3818MiB / 8192MiB | 4% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
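For this kind of cudaGetDevice failure, one thing worth ruling out is whether the dlib build inside the container actually has CUDA support compiled in. A minimal check, assuming a Python shell inside the container (dlib.DLIB_USE_CUDA and dlib.cuda.get_num_devices() are dlib's standard introspection hooks for this):

```python
def dlib_cuda_status():
    """Return a short status string describing dlib's CUDA support."""
    try:
        import dlib  # heavyweight import; may be absent outside the container
    except ImportError:
        return "dlib not installed"
    if not getattr(dlib, "DLIB_USE_CUDA", False):
        return "dlib built without CUDA support"
    # Only safe to call when dlib was compiled against CUDA:
    return f"dlib with CUDA, {dlib.cuda.get_num_devices()} device(s) visible"

print(dlib_cuda_status())
```

If this reports "built without CUDA", the pip wheel was compiled CPU-only and dlib needs to be rebuilt against the container's CUDA toolkit; if it reports CUDA but zero devices, the problem is more likely the driver/runtime pairing.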
thanks a lot
On Sat, May 25, 2024 at 9:34 PM Jan Berdajs ***@***.***> wrote:
> After much testing I managed to make a docker image which works for me in
> docker: https://github.com/mrbrdo/facerecognition-external-model-cuda
> [...]
Well, to add more information: I was able to rebuild the image with
FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 AS builder
There is no version mismatch anymore, but I am still getting the same error.
I will keep troubleshooting this.
regards
@guillebot I'm not sure why you have that issue. Perhaps try driver version 535? Is the nvidia-smi output from the host or from the container? I don't think the CUDA version should matter, as long as it is not higher than what nvidia-smi reports (that is the maximum supported version). You could even try CUDA 11.
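The compatibility rule above (the container's CUDA toolkit must not be newer than what the driver supports) can be checked programmatically. A sketch, assuming nvidia-smi is on the PATH; the 12.4 toolkit version is taken from the output quoted earlier in this thread:

```python
import re
import subprocess

def max_supported_cuda():
    """Parse 'CUDA Version: X.Y' from the nvidia-smi header, or None."""
    try:
        out = subprocess.run(["nvidia-smi"], capture_output=True,
                             text=True).stdout
    except FileNotFoundError:
        return None  # nvidia-smi not installed on this machine
    m = re.search(r"CUDA Version:\s*(\d+\.\d+)", out)
    return float(m.group(1)) if m else None

supported = max_supported_cuda()
if supported is not None:
    # e.g. a 12.4.x toolkit image is fine when the driver reports >= 12.4
    print("CUDA 12.4 image compatible:", 12.4 <= supported)
```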
Hi
I could not find this info anywhere in the specs: does the Docker version of the external model support GPU usage?
If yes: great!!
If no: would it be possible to provide a Dockerfile that includes this option?
Background: I have a perfectly smooth-running NC, including easy update processes and custom configuration, so I would much prefer not to build my own NC image: I suspect a high probability of issues in bringing my current setup back to life, and the updatability/upgradability of a self-built image is another unknown. Adding the external model on the same host seems a much safer route to take.
Thanks for the clarification! akrea