mistralai / mistral-inference

Official inference library for Mistral models
https://mistral.ai/
Apache License 2.0

CUDA EXTENSION NOT INSTALLED nvcr.io/nvidia/pytorch:22.12-py3 #136

Closed skr3178 closed 3 months ago

skr3178 commented 3 months ago

I also posted this at https://github.com/AutoGPTQ/AutoGPTQ/issues/598

!nvidia-smi

Sun Mar 17 12:04:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:01:00.0  On |                  N/A |
|  0%   47C    P8             24W /  170W |     418MiB /  12288MiB |     14%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0

import torch

print(torch.__version__)

2.1.0a0+fe05266

import torch 
print(torch.version.cuda)

12.1

!pip list

Package Version
absl-py 1.4.0
accelerate 0.28.0
aiohttp 3.8.4
aiosignal 1.3.1
apex 0.1
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.2.1
astunparse 1.6.3
async-timeout 4.0.2
attrs 22.2.0
audioread 3.0.0
auto-gptq 0.6.0
backcall 0.2.0
beautifulsoup4 4.12.2
bitsandbytes 0.43.0
bleach 6.0.0
blis 0.7.9
cachetools 5.3.0
catalogue 2.0.8
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 3.1.0
click 8.1.3
cloudpickle 2.2.1
cmake 3.24.1.1
coloredlogs 15.0.1
comm 0.1.3
confection 0.0.4
contourpy 1.0.7
cubinlinker 0.2.2+2.g5f51201
cuda-python 12.1.0rc5+1.g808384c
cudf 23.2.0
cugraph 23.2.0
cugraph-dgl 23.2.0
cugraph-service-client 23.2.0
cugraph-service-server 23.2.0
cuml 23.2.0
cupy-cuda12x 12.0.0b3
cycler 0.11.0
cymem 2.0.7
Cython 0.29.34
dask 2023.1.1
dask-cuda 23.2.0
dask-cudf 23.2.0
datasets 2.18.0
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.8
distributed 2023.1.1
exceptiongroup 1.1.1
execnet 1.9.0
executing 1.2.0
expecttest 0.1.3
fastjsonschema 2.16.3
fastrlock 0.8.1
filelock 3.11.0
flash-attn 0.2.8.dev0
fonttools 4.39.3
frozenlist 1.3.3
fsspec 2024.2.0
gast 0.4.0
gekko 1.0.7
google-auth 2.17.3
google-auth-oauthlib 0.4.6
graphsurgeon 0.4.6
grpcio 1.53.0
HeapDict 1.0.1
huggingface-hub 0.21.4
humanfriendly 10.0
hypothesis 5.35.1
idna 3.4
importlib-metadata 6.3.0
importlib-resources 5.12.0
iniconfig 2.0.0
intel-openmp 2021.4.0
ipykernel 6.22.0
ipython 8.12.0
ipython-genutils 0.2.0
ipywidgets 8.1.2
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
json5 0.9.11
jsonschema 4.17.3
jupyter_client 8.2.0
jupyter_core 5.3.0
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab-pygments 0.2.2
jupyterlab-server 1.2.0
jupyterlab_widgets 3.0.10
jupytext 1.14.5
kiwisolver 1.4.4
langcodes 3.3.0
librosa 0.9.2
lit 16.0.1
llvmlite 0.39.1
locket 1.0.0
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.5
mdurl 0.1.2
mistune 2.0.5
mkl 2021.1.1
mkl-devel 2021.1.1
mkl-include 2021.1.1
mock 5.0.1
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multiprocess 0.70.16
murmurhash 1.0.9
nbclient 0.7.3
nbconvert 7.3.1
nbformat 5.8.0
nest-asyncio 1.5.6
networkx 2.6.3
notebook 6.4.10
numba 0.56.4+1.g536eedd6e
numpy 1.22.2
nvidia-dali-cuda110 1.24.0
nvidia-pyindex 1.0.9
nvtx 0.2.5
oauthlib 3.2.2
onnx 1.13.1
opencv 4.6.0
optimum 1.17.1
packaging 23.0
pandas 1.5.2
pandocfilters 1.5.0
parso 0.8.3
partd 1.3.0
pathy 0.10.1
peft 0.9.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.2.0
pip 21.2.4
pkgutil_resolve_name 1.3.10
platformdirs 3.2.0
pluggy 1.0.0
ply 3.11
polygraphy 0.46.2
pooch 1.7.0
preshed 3.0.8
prettytable 3.7.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.4
ptxcompiler 0.7.0+27.gb446f00
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 15.0.1
pyarrow-hotfix 0.6
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.10.4
pycocotools 2.0+nv0.7.1
pycparser 2.21
pydantic 1.10.7
Pygments 2.15.0
pylibcugraph 23.2.0
pylibcugraphops 23.2.0
pylibraft 23.2.0
pynvml 11.4.1
pyparsing 3.0.9
pyrsistent 0.19.3
pytest 7.3.1
pytest-rerunfailures 11.1.2
pytest-shard 0.1.2
pytest-xdist 3.2.1
python-dateutil 2.8.2
python-hostlist 1.23.0
pytorch-quantization 2.1.2
pytz 2023.3
PyYAML 6.0
pyzmq 25.0.2
raft-dask 23.2.0
regex 2023.3.23
requests 2.28.2
requests-oauthlib 1.3.1
resampy 0.4.2
rmm 23.2.0
rouge 1.0.1
rsa 4.9
safetensors 0.4.2
scikit-learn 1.2.0
scipy 1.10.1
seaborn 0.12.2
Send2Trash 1.8.0
sentencepiece 0.2.0
setuptools 65.5.1
six 1.16.0
smart-open 6.3.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4
spacy 3.5.2
spacy-legacy 3.0.12
spacy-loggers 1.0.4
sphinx-glpi-theme 0.3
srsly 2.4.6
stack-data 0.6.2
strings-udf 23.2.0
sympy 1.11.1
tbb 2021.9.0
tblib 1.7.0
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorrt 8.6.1
terminado 0.17.1
thinc 8.1.9
threadpoolctl 3.1.0
thriftpy2 0.4.16
tinycss2 1.2.1
tokenizers 0.15.2
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.1.0a0+fe05266
torch-tensorrt 1.4.0.dev0
torchtext 0.13.0a0+fae8e8c
torchvision 0.15.0a0
tornado 6.2
tqdm 4.65.0
traitlets 5.9.0
transformer-engine 0.7.0
transformers 4.38.2
treelite 3.1.0
treelite-runtime 3.1.0
triton 2.0.0
typer 0.7.0
types-dataclasses 0.6.6
typing_extensions 4.5.0
ucx-py 0.30.0
uff 0.6.9
urllib3 1.26.15
wasabi 1.1.1
wcwidth 0.2.6
webencodings 0.5.1
Werkzeug 2.2.3
wheel 0.40.0
widgetsnbextension 4.0.10
xdoctest 1.0.2
xgboost 1.7.1
xxhash 3.4.1
yarl 1.8.2
zict 2.2.0
zipp 3.15.0

WARNING: You are using pip version 21.2.4; however, version 24.0 is available.
You should consider upgrading via the '/usr/bin/python -m pip install --upgrade pip' command.

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import prepare_model_for_kbit_training
from peft import LoraConfig, get_peft_model
from datasets import load_dataset
import transformers

model_name = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="auto", # automatically figures out how to best use CPU + GPU for loading model
                                             trust_remote_code=False, # prevents running custom model files on your machine
                                             revision="main") # which version of model to use in repo

CUDA extension not installed.
CUDA extension not installed.
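One way to narrow this down is to check whether the compiled kernel modules can be imported at all in this container. The sketch below assumes the message comes from auto-gptq failing to import its CUDA extension. The module names `autogptq_cuda_64` and `autogptq_cuda_256` are my assumption about how auto-gptq 0.6.0 names those extensions and are not confirmed anywhere in this thread. `probe_modules` is a hypothetical helper, not part of any library.

```python
import importlib


def probe_modules(names):
    """Try to import each module; map name -> None on success, else the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or e.g. OSError from a broken shared object
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Assumed names of AutoGPTQ's compiled kernel modules (not confirmed in this thread).
    kernel_modules = ["autogptq_cuda_64", "autogptq_cuda_256"]
    for name, error in probe_modules(["torch", "auto_gptq", *kernel_modules]).items():
        print(f"{name}: {'ok' if error is None else error}")
```

If `torch` and `auto_gptq` import cleanly but the kernel modules do not, that would suggest the installed auto-gptq wheel carries no CUDA kernels built against this container's torch 2.1.0a0+fe05266 / CUDA 12.1 combination.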

skr3178 commented 3 months ago

https://github.com/AutoGPTQ/AutoGPTQ/issues/598