[BUG]docker cannot found cuda-toolkit

OS

Windows

GPU Library

CUDA 12.x

Python version

3.10

Describe the bug

~/tabbyAPI# docker run --gpus all 3fd2b9e9cf8fd03dc3601a7e1712611006e507a852e160933cda81f0f37a0036 (docker image:ghcr.io/theroyallab/tabbyapi:latest)
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:128: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Traceback (most recent call last):
  File "/app/main.py", line 11, in <module>
    from common import gen_logging, sampling, model
  File "/app/common/model.py", line 19, in <module>
    from backends.exllamav2.model import ExllamaV2Container
  File "/app/backends/exllamav2/model.py", line 12, in <module>
    from exllamav2 import (
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/model.py", line 41, in <module>
    from exllamav2.attn import ExLlamaV2Attention, has_flash_attn, has_xformers
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/attn.py", line 38, in <module>
    is_ampere_or_newer_gpu = any(torch.cuda.get_device_properties(i).major >= 8 for i in range(torch.cuda.device_count()))
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/attn.py", line 38, in <genexpr>
    is_ampere_or_newer_gpu = any(torch.cuda.get_device_properties(i).major >= 8 for i in range(torch.cuda.device_count()))
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 465, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 314, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found

when I input nvcc -V /bin/sh: 1: nvcc: not found This doesn't seem to be my fault, because

docker run -it --gpus all 0dd75116a8ce8e6dd3cf6db1eb249d14f07f4115a1d35aaeb29bedfe8bc383f0/bin/bash（docker image:pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel） 

==========
== CUDA ==
==========

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

root@58d50a5fd49e:/workspace# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

Nvidia container runtime can start normally in another Docker image

Reproduction steps

git clone https://github.com/theroyallab/tabbyAPI
docker compose -f docker/docker-compose.yml up

Expected behavior

docker run on gpu model successful

Logs

No response

Additional context