nextcloud / context_chat_backend

GNU Affero General Public License v3.0

[bug]: hwdetect.sh does not detect GPUs #66

Closed ga-it closed 2 months ago

ga-it commented 2 months ago

Describe the bug

The hwdetect script greps for "VGA.*NVIDIA", which does not match all NVIDIA GPUs. It does not pick up Tesla P40s, which lspci lists as "3D controller" rather than "VGA". I override this with a persistent compose.yaml, but that is obviously not optimal.

To Reproduce

Steps to reproduce the behavior:

  1. Run docker startup:
     docker run -d --gpus '"device=0,1"' -v /opt/context_chat_backend/config.gpu.yaml:/app/config.yaml -v /data/context_chat_backend/persistent_storage:/app/persistent_storage -v /opt/context_chat_backend/context_chat_backend:/app/context_chat_backend -v /var/run/docker.sock:/var/run/docker.sock --env-file /opt/context_chat_backend.env --name context_chat_backend -p 10034:10034 context_chat_backend_dev:latest

Expected behavior

Correct detection of GPUs

Relevant code

```
# if the COMPUTE_DEVICE env var is not set, try to detect the hardware
if [ -z "$accel" ]; then
    echo "Detecting hardware..."
    lspci_out=$(lspci)
    if echo "$lspci_out" | grep -q "VGA.*NVIDIA"; then
        accel="cuda"
    else
        accel="cpu"
    fi
    echo "Detected hardware: $accel"
fi
```

context_chat_backend startup

```
Detecting hardware...
Detected hardware: cpu
```

GPUs per lspci

```
lspci | grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
82:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
```


kyteinsky commented 2 months ago

Hi, thanks for the report. This should be good enough, I guess:

if echo "$lspci_out" | grep -q -E "(VGA|3D).*NVIDIA"; then
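
A minimal sketch of how the proposed pattern could be checked against the lspci output from this report, without needing the hardware. The `detect_accel` function name and the non-NVIDIA sample line are hypothetical and used only for illustration; the Tesla P40 line is copied from the output above.

```shell
# Hypothetical helper: classify lspci output as "cuda" or "cpu".
# Uses the extended pattern so that both "VGA compatible controller"
# and "3D controller" NVIDIA entries are matched.
detect_accel() {
    if printf '%s\n' "$1" | grep -q -E "(VGA|3D).*NVIDIA"; then
        echo "cuda"
    else
        echo "cpu"
    fi
}

# Line taken from the lspci output in this report.
tesla_line='03:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)'
# Made-up non-NVIDIA line, for illustration only.
other_line='00:02.0 VGA compatible controller: Example Vendor integrated graphics'

detect_accel "$tesla_line"   # prints "cuda"
detect_accel "$other_line"   # prints "cpu"
```

The original "VGA.*NVIDIA" pattern does not match the Tesla P40 line because that line contains "3D controller" and no "VGA", which is why the startup log reported cpu.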