intel-analytics / ipex-llm-tutorial

Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
https://github.com/intel-analytics/bigdl
Apache License 2.0

GPU acceleration failed #79

Open doubtfire009 opened 7 months ago

doubtfire009 commented 7 months ago

I used the code from here: https://github.com/intel-analytics/ipex-llm-tutorial/blob/original-bigdl-llm/Chinese_Version/ch_6_GPU_Acceleration/6_1_GPU_Llama2-7B.md

But it failed. Can you help with this? Thanks.

`from bigdl.llm.transformers import AutoModelForCausalLM, AutoModel
from transformers import LlamaTokenizer, AutoTokenizer

chatglm3_6b = 'D:/AI_projects/Langchain-Chatchat/llm_model/THUDM/chatglm2-6b'

model_in_4bit = AutoModel.from_pretrained(pretrained_model_name_or_path=chatglm3_6b,
                                          load_in_4bit=True,
                                          optimize_model=False)
model_in_4bit_gpu = model_in_4bit.to('xpu')

# Note: the AutoModelForCausalLM used here is imported from bigdl.llm.transformers
model_in_8bit = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=chatglm3_6b,
    load_in_low_bit="sym_int8",
    optimize_model=False
)
model_in_8bit_gpu = model_in_8bit.to('xpu')

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=chatglm3_6b)
`

The error shows:

(llm_310_whl) D:\AI_projects\ipex-samples>python main-test.py
C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-04-07 00:20:03,696 - INFO - intel_extension_for_pytorch auto imported
Traceback (most recent call last):
  File "D:\AI_projects\ipex-samples\main-test.py", line 6, in <module>
    model_in_4bit = AutoModel.from_pretrained(pretrained_model_name_or_path=chatglm3_6b,
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\bigdl\llm\transformers\model.py", line 320, in from_pretrained
    model = cls.load_convert(q_k, optimize_model, *args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\bigdl\llm\transformers\model.py", line 434, in load_convert
    model = cls.HF_Model.from_pretrained(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\models\auto\auto_factory.py", line 461, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\models\auto\configuration_auto.py", line 986, in from_pretrained
    trust_remote_code = resolve_trust_remote_code(
  File "C:\ProgramData\anaconda3\envs\llm_310_whl\lib\site-packages\transformers\dynamic_module_utils.py", line 535, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?
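The last frame of this traceback can be checked in isolation, without bigdl-llm or transformers installed. `signal.SIGALRM` is a POSIX-only constant in the Python standard library, so it is absent on Windows. The helper name `has_sigalrm` below is made up for this sketch:

```python
import signal
import sys


def has_sigalrm() -> bool:
    """Return True if the running platform exposes signal.SIGALRM."""
    return hasattr(signal, "SIGALRM")


# On Windows this reports False, which is why any code path that touches
# signal.SIGALRM raises AttributeError there.
print(f"platform={sys.platform!r}, SIGALRM available: {has_sigalrm()}")
```

If this reports that SIGALRM is unavailable, the error above comes from the platform rather than from the GPU setup or the chosen quantization.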

And the pip list is:

accelerate 0.21.0
aiohttp 3.9.3
aiosignal 1.3.1
altair 4.2.2
annotated-types 0.6.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
bigdl-core-xe-21 2.5.0b20240324
bigdl-llm 2.5.0b20240406
blinker 1.7.0
cachetools 5.3.3
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
contourpy 1.2.1
cryptography 42.0.5
cycler 0.12.1
dataclasses-json 0.5.14
decorator 5.1.1
entrypoints 0.4
exceptiongroup 1.2.0
executing 2.0.1
faiss-cpu 1.8.0
filelock 3.13.3
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
gitdb 4.0.11
GitPython 3.1.43
google-ai-generativelanguage 0.2.0
google-api-core 2.18.0
google-auth 2.29.0
google-generativeai 0.1.0
googleapis-common-protos 1.63.0
greenlet 3.0.3
grpcio 1.62.1
grpcio-status 1.48.2
huggingface-hub 0.22.2
idna 3.6
importlib_metadata 7.1.0
intel-extension-for-pytorch 2.1.10+xpu
intel-openmp 2024.1.0
ipython 8.23.0
jedi 0.19.1
Jinja2 3.1.3
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
langchain 0.0.180
markdown-it-py 3.0.0
MarkupSafe 2.1.5
marshmallow 3.21.1
matplotlib 3.8.4
matplotlib-inline 0.1.6
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.5
mypy-extensions 1.0.0
networkx 3.3
numexpr 2.10.0
numpy 1.26.4
openai 0.27.7
openapi-schema-pydantic 1.2.4
packaging 24.0
pandas 2.2.1
pandasai 0.2.15
parso 0.8.4
pdfminer.six 20231228
pdfplumber 0.11.0
pillow 10.3.0
pip 23.3.1
prompt-toolkit 3.0.43
proto-plus 1.23.0
protobuf 3.20.3
psutil 5.9.8
pure-eval 0.2.2
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyasn1 0.6.0
pyasn1_modules 0.4.0
pycparser 2.22
pydantic 1.10.15
pydantic_core 2.16.3
pydeck 0.8.1b0
Pygments 2.17.2
Pympler 1.0.1
pyparsing 3.1.2
pypdf 3.9.0
pypdfium2 4.28.0
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
pytz 2024.1
PyYAML 6.0.1
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
rich 13.7.1
rpds-py 0.18.0
rsa 4.9
safetensors 0.4.2
sentencepiece 0.2.0
setuptools 68.2.2
six 1.16.0
smmap 5.0.1
SQLAlchemy 2.0.29
stack-data 0.6.3
streamlit 1.22.0
streamlit-chat 0.0.2.2
sympy 1.12
tabulate 0.9.0
tenacity 8.2.3
tiktoken 0.4.0
tokenizers 0.13.3
toml 0.10.2
toolz 0.12.1
torch 2.1.0a0+cxx11.abi
torchaudio 2.1.0a0+cxx11.abi
torchvision 0.16.0a0+cxx11.abi
tornado 6.4
tqdm 4.66.2
traitlets 5.14.2
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
tzlocal 5.2
urllib3 2.2.1
validators 0.28.0
watchdog 4.0.0
wcwidth 0.2.13
wheel 0.41.2
yarl 1.9.4
youtube-transcript-api 0.6.0
zipp 3.18.1

Loading with load_in_low_bit="sym_int8" fails as well.

Oscilloscope98 commented 7 months ago

Hi @doubtfire009,

For chatglm3-6b, you could refer to https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3 regarding how to run it on Intel GPU with IPEX-LLM optimizations.

You could load it with ipex_llm.transformers.AutoModel as follows:

from ipex_llm.transformers import AutoModel

model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  optimize_model=True,
                                  trust_remote_code=True,
                                  use_cache=True)
model = model.to('xpu')

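It seems that you are missing trust_remote_code=True. ChatGLM ships its own modeling code, so without this flag transformers tries to ask for confirmation interactively, using a timeout built on signal.SIGALRM. That signal does not exist on Windows, which is where the AttributeError in your traceback comes from.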
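Applied to the bigdl-llm script from the original post, the change could look roughly like the sketch below. It assumes that `bigdl.llm.transformers.AutoModel.from_pretrained` forwards `trust_remote_code` and `use_cache` the same way the `ipex_llm` call above does, and that `load_in_low_bit="sym_int4"` corresponds to `load_in_4bit=True`. The function names `low_bit_load_kwargs` and `load_chatglm_on_xpu` are made up for illustration.

```python
def low_bit_load_kwargs(low_bit: str = "sym_int4") -> dict:
    """Build from_pretrained keyword arguments; trust_remote_code is the key addition."""
    return {
        # "sym_int8" is the value used in the original script;
        # "sym_int4" is assumed to match load_in_4bit=True.
        "load_in_low_bit": low_bit,
        "optimize_model": True,
        "trust_remote_code": True,  # ChatGLM ships custom modeling code
        "use_cache": True,
    }


def load_chatglm_on_xpu(model_path: str, low_bit: str = "sym_int4"):
    """Load a local ChatGLM checkpoint in low-bit form and move it to an Intel GPU."""
    # Imported lazily so this file can be imported on machines without bigdl-llm.
    from bigdl.llm.transformers import AutoModel
    from transformers import AutoTokenizer

    model = AutoModel.from_pretrained(model_path, **low_bit_load_kwargs(low_bit))
    model = model.to("xpu")
    # The ChatGLM tokenizer is also custom code, so it needs the same flag.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    return model, tokenizer
```

Calling `load_chatglm_on_xpu(chatglm3_6b)` would replace the 4-bit branch of the original script, and `load_chatglm_on_xpu(chatglm3_6b, low_bit="sym_int8")` the 8-bit one. Passing the flag to the tokenizer matters because that call goes through the same remote-code check.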

You could also refer to the IPEX-LLM installation documentation for Intel GPUs for more information.

Please let us know if you run into any further problems :)