[Open] akrea opened this issue 2 years ago
So I will wait until the Nextcloud Docker image provides the dlib package by default. That should happen as soon as dlib is out of its "infant" state.
Still, I would prefer to use CUDA cores with the external model.
cheers
I was able to get the external model running with CUDA on Ubuntu, but I didn't document how I did it.
Since I want to get rid of my Ubuntu setup, I might try to get it running in a Docker container. However, I have zero experience with Docker, so this will take quite some time!
After much testing I managed to build a Docker image that works for me: https://github.com/mrbrdo/facerecognition-external-model-cuda
I am using NVIDIA driver version 535 on Ubuntu 22.04 with nvidia-container-toolkit (see the NVIDIA install docs).
To use it (port 8090):
cd facerecognition-external-model-cuda
docker build --tag 'facerecognition-external-cuda' .
sudo docker run --rm --runtime=nvidia --gpus all -i -p 8090:5000 -v /PATH/TO/YOUR/api.key:/app/api.key -e FACE_MODEL=4 --name facerecognition-external-cuda facerecognition-external-cuda
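Before building, it may help to confirm that Docker can see the GPU at all. A minimal sanity check, assuming nvidia-container-toolkit is already configured (the CUDA base image tag here is just an example, not taken from the repo):

```shell
# If the toolkit is wired up correctly, this prints the same nvidia-smi
# table as on the host; otherwise it falls through to the hint below.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi \
  || echo "GPU not visible to Docker; check the nvidia-container-toolkit install"
```

If this already fails, no amount of fiddling with the face recognition image will help, so it is worth ruling out first.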
Hi Jan, Matias,
As I have a huge image collection and an RTX card, I jumped at trying this.
I was able to build and run the CUDA image, but when I try to use it with face:background_job, I get this error:
CNN_DETECTOR = dlib.cnn_face_detection_model_v1(DETECTOR_PATH)
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file
/tmp/pip-wheel-js0yp6db/dlib_666e8ae1fada4f51b17d0e199472f5a8/dlib/cuda/gpu_data.cpp:204.
code: 500, reason: named symbol not found
Could it be that my installed driver is newer?
This is what I get with nvidia-smi in your container:
# nvidia-smi
Sun May 26 18:04:20 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.00 Driver Version: 560.38 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 2080 ... On | 00000000:01:00.0 On | N/A |
| 0% 41C P2 53W / 250W | 3818MiB / 8192MiB | 4% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
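For this kind of cudaGetDevice failure, one thing worth ruling out is whether the dlib build inside the container actually has CUDA support compiled in. A minimal check, assuming a Python shell inside the container (dlib.DLIB_USE_CUDA and dlib.cuda.get_num_devices() are dlib's standard introspection hooks for this):

```python
def dlib_cuda_status():
    """Return a short status string describing dlib's CUDA support."""
    try:
        import dlib  # heavyweight import; may be absent outside the container
    except ImportError:
        return "dlib not installed"
    if not getattr(dlib, "DLIB_USE_CUDA", False):
        return "dlib built without CUDA support"
    # Only safe to call when dlib was compiled against CUDA:
    return f"dlib with CUDA, {dlib.cuda.get_num_devices()} device(s) visible"

print(dlib_cuda_status())
```

If this reports "built without CUDA", the pip wheel was compiled CPU-only and dlib needs to be rebuilt against the container's CUDA toolkit; if it reports CUDA but zero devices, the problem is more likely the driver/runtime pairing.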
thanks a lot
On Sat, May 25, 2024 at 9:34 PM Jan Berdajs ***@***.***> wrote:
> After much testing I managed to make a docker image which works for me in
> docker: https://github.com/mrbrdo/facerecognition-external-model-cuda
> [...]
Well, to add more information: I was able to rebuild the image with
FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 AS builder
There is no version mismatch anymore, but I am still getting the same error.
I will keep troubleshooting this.
regards
@guillebot I'm not sure why you have that issue. Perhaps try driver version 535? Is the nvidia-smi output from the host or from the container? I don't think the CUDA version should matter, as long as it is not higher than what nvidia-smi reports (that is the maximum supported version). You could even try CUDA 11.
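The compatibility rule above (the container's CUDA toolkit must not be newer than what the driver supports) can be checked programmatically. A sketch, assuming nvidia-smi is on the PATH; the 12.4 toolkit version is taken from the output quoted earlier in this thread:

```python
import re
import subprocess

def max_supported_cuda():
    """Parse 'CUDA Version: X.Y' from the nvidia-smi header, or None."""
    try:
        out = subprocess.run(["nvidia-smi"], capture_output=True,
                             text=True).stdout
    except FileNotFoundError:
        return None  # nvidia-smi not installed on this machine
    m = re.search(r"CUDA Version:\s*(\d+\.\d+)", out)
    return float(m.group(1)) if m else None

supported = max_supported_cuda()
if supported is not None:
    # e.g. a 12.4.x toolkit image is fine when the driver reports >= 12.4
    print("CUDA 12.4 image compatible:", 12.4 <= supported)
```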
Hi
I could not find this info anywhere in the specs: does the Docker version of the external model support GPU usage?
If yes: great!!
If no: would it be possible to provide a Dockerfile that includes this option?
Background: I have a perfectly smooth-running NC, including easy update processes and custom configuration, so I would much prefer not to build my own NC image: I suspect a high probability of issues in bringing my current setup back to life, and the updatability/upgradability of a self-built image is another unknown. Adding the external model on the same host seems a much safer route to take.
Thanks for the clarification! akrea