juncongmoo / pyllama

LLaMA: Open and Efficient Foundation Language Models
GNU General Public License v3.0

Unknown CUDA error #36

Closed · AceBeaker2 closed this issue 1 year ago

AceBeaker2 commented 1 year ago
Traceback (most recent call last):
  File "/home/orion/AI-Horde-Worker/llama.cpp/pyllama/inference.py", line 82, in <module>
    run(
  File "/home/orion/AI-Horde-Worker/llama.cpp/pyllama/inference.py", line 50, in run
    generator = load(
  File "/home/orion/AI-Horde-Worker/llama.cpp/pyllama/inference.py", line 33, in load
    model = Transformer(model_args)
  File "/home/orion/AI-Horde-Worker/llama.cpp/pyllama/llama/model_single.py", line 195, in __init__
    self.tok_embeddings = nn.Embedding(params.vocab_size, params.dim)
  File "/home/orion/.local/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 142, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
  File "/home/orion/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW
The command I ran:
python3 pyllama/inference.py --ckpt_dir models/7B/ --tokenizer_path models/tokenizer.model

Environment: Ubuntu 22, CUDA 12.1, RTX 3060 Ti.
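
For context: CUDA error 804 ("forward compatibility was attempted on non supported HW") usually means the userspace CUDA libraries and the loaded kernel driver module are out of sync, commonly after a driver upgrade without a reboot, which is consistent with the reboot fix reported later in this thread. A minimal sanity check, assuming only that PyTorch is installed:

import torch

# Lazily initializes the CUDA runtime; an error-804 driver mismatch would
# surface here instead of deep inside nn.Embedding's weight allocation.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("PyTorch built against CUDA:", torch.version.cuda)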

Bradley-Butcher commented 1 year ago

Can you show the output of nvidia-smi?

AceBeaker2 commented 1 year ago
(llama) orion@skynet:~$ nvidia-smi
Fri Mar 17 20:30:47 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti      On | 00000000:09:00.0 Off |                  N/A |
|  0%   38C    P8               18W / 200W|    157MiB /  8192MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2019      G   /usr/lib/xorg/Xorg                           71MiB |
|    0   N/A  N/A      2309      G   cinnamon                                     33MiB |
|    0   N/A  N/A      3444      G   ...53317852,5022907119068916348,131072       51MiB |
+---------------------------------------------------------------------------------------+


AceBeaker2 commented 1 year ago

I managed to fix it by rebooting, but I'm getting a new error now:

(llama) orion@skynet:~/AI-Horde-Worker/llama.cpp$ python3 pyllama/llama/llama_quant.py models/7B/ c4 --wbits 16 --save pyllama-7B8b.pt
Traceback (most recent call last):
  File "/home/orion/AI-Horde-Worker/llama.cpp/pyllama/llama/llama_quant.py", line 6, in <module>
    from gptq import (
  File "/home/orion/.local/lib/python3.10/site-packages/gptq/__init__.py", line 9, in <module>
    from .gptq import GPTQ
  File "/home/orion/.local/lib/python3.10/site-packages/gptq/gptq.py", line 5, in <module>
    from .quant import quantize
  File "/home/orion/.local/lib/python3.10/site-packages/gptq/quant.py", line 4, in <module>
    from quant_cuda import matvmul2, matvmul3, matvmul4, matvmul8, matvmul16
ModuleNotFoundError: No module named 'quant_cuda'
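
For context: judging from the traceback, quant_cuda is a compiled CUDA extension that gptq's quant.py imports for its quantized matvmul kernels, so this ModuleNotFoundError suggests the extension was never built or installed into the active Python environment (such extensions typically have to be compiled against the local CUDA toolkit). A quick check of what the interpreter can actually resolve, independent of how gptq was installed:

import importlib.util

# quant_cuda is a compiled extension module; if find_spec returns None,
# it is not installed in the environment this interpreter is using.
spec = importlib.util.find_spec("quant_cuda")
print("quant_cuda location:", spec.origin if spec else "not installed")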