turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

list index out of range #292

Closed by j2l 1 year ago

j2l commented 1 year ago

On my Ubuntu 22.04 (Pop!_OS) machine, test_benchmark_inference.py throws:

/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr'
/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/home/pm/Documents/github/dockeronly/exllama-master/test_benchmark_inference.py", line 1, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "/home/pm/Documents/github/dockeronly/exllama-master/model.py", line 12, in <module>
    import cuda_ext
  File "/home/pm/Documents/github/dockeronly/exllama-master/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
  File "/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1308, in load
    return _jit_compile(
  File "/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1810, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2199, in _write_ninja_file_to_build_library
    cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
  File "/home/pm/Documents/github/dockeronly/exllama-master/exllama/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1980, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
IndexError: list index out of range

Upgrading the NVIDIA driver threw:

...
ernelstub.Installer : INFO     Making entry file for Pop!_OS
Errors were encountered while processing:
 nvidia-dkms-520
 cuda-drivers-520
 cuda-drivers
 nvidia-driver-520
 cuda-runtime-11-8
 cuda-11-8
 cuda-demo-suite-11-8
 cuda
E: Sub-process /usr/bin/dpkg returned an error code (1)

Any idea?
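
For context, the `IndexError` at the bottom of the traceback looks like a symptom of the CUDA warning at the top (Error 803, driver / CUDA mismatch) rather than a separate bug. When `TORCH_CUDA_ARCH_LIST` is not set, `_get_cuda_arch_flags()` appears to build its architecture list from the GPUs PyTorch can see, so with zero visible devices the list is empty and `arch_list[-1]` fails. Below is a minimal sketch of that failure mode and a pre-flight check. `arch_flags_sketch`, `visible_capabilities` and `preflight` are hypothetical helper names, and the first one is a simplified model, not the actual PyTorch source.

```python
def arch_flags_sketch(capabilities):
    """Simplified model (assumption) of the arch list built when
    TORCH_CUDA_ARCH_LIST is unset: one entry per visible GPU."""
    arch_list = sorted(f"{major}.{minor}" for major, minor in capabilities)
    # With no visible GPUs the list is empty, so this line raises
    # IndexError: list index out of range, as in the traceback.
    arch_list[-1] += "+PTX"
    return arch_list


def visible_capabilities():
    """Return (major, minor) compute capabilities of the GPUs PyTorch can see.
    Returns an empty list if torch is missing or CUDA cannot be initialized."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_capability(i)
            for i in range(torch.cuda.device_count())]


def preflight(capabilities):
    """Report a readable message instead of failing inside the JIT build."""
    if not capabilities:
        return ("No CUDA devices visible: fix the driver / CUDA mismatch "
                "before building the extension.")
    return "OK: " + ", ".join(arch_flags_sketch(capabilities))


if __name__ == "__main__":
    print(preflight(visible_capabilities()))
```

Running this before the benchmark script should show whether PyTorch sees a GPU at all. Setting `TORCH_CUDA_ARCH_LIST` by hand would probably get past the `IndexError`, but the compiled extension still could not run until the driver problem is fixed.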

j2l commented 1 year ago

Actually, the update from https://github.com/turboderp/exllama/issues/65#issuecomment-1595872644 didn't work properly. The desktop could no longer boot at native resolution and fell back to 800x600. After some time, I noticed I had been using driver 535, which is more recent than needed. I managed to get back on 535 (Software & Updates > Additional Drivers > NVIDIA ... 535).

Closing this one, since I now have a different problem.