xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

BUG: error raised when launching model chatglm-pytorch-6b with 8-bit quantization #742

Closed: liyanz1377 closed this issue 2 months ago

liyanz1377 commented 10 months ago

Describe the bug

Ubuntu 20.04, Python 3.11, latest xinference.

To Reproduce

To help us to reproduce this bug, please provide information below:

2023-12-08 11:41:10,825 - modelscope - INFO - PyTorch version 2.1.0 Found.
2023-12-08 11:41:10,826 - modelscope - INFO - Loading ast index from /home/gitlab-runner/.cache/modelscope/ast_indexer
2023-12-08 11:41:10,848 - modelscope - INFO - Loading done! Current index file version is 1.9.5, with md5 ea847914eff50e2c71d1b2e790a604b9 and a total number of 945 components indexed
2023-12-08 11:41:10,930 xinference.model.llm.llm_family 1942993 INFO Caching from Hugging Face: THUDM/chatglm-6b
2023-12-08 11:41:10,930 xinference.model.llm.llm_family 1942993 INFO Cache /home/gitlab-runner/.xinference/cache/chatglm-pytorch-6b exists
2023-12-08 11:41:11,532 xinference.core.worker 1942993 ERROR Failed to load model 9f4ddbe0-957b-11ee-893d-ed64cc43fd58-1-0
Traceback (most recent call last):
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/worker.py", line 336, in launch_builtin_model
    await model_ref.load()
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/api.py", line 306, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 524, in xoscar.core._BaseActor.__on_receive__
    result = func(*args, **kwargs)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 166, in load
    self._model.load()
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/pytorch/core.py", line 170, in load
    self._model, self._tokenizer = load_compress_model(
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/pytorch/compression.py", line 128, in load_compress_model
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 448, in from_config
    raise ValueError(
ValueError: [address=0.0.0.0:39293, pid=1947936] Unrecognized configuration class <class 'transformers_modules.chatglm-pytorch-6b.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, FalconConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MptConfig, MusicgenConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.

2023-12-08 11:41:11,572 xinference.api.restful_api 1942825 ERROR [address=0.0.0.0:39293, pid=1947936] Unrecognized configuration class <class 'transformers_modules.chatglm-pytorch-6b.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM. [same model-type list as above]
Traceback (most recent call last):
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/api/restful_api.py", line 417, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xoscar/api.py", line 306, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/supervisor.py", line 476, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/supervisor.py", line 445, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/utils.py", line 33, in wrapped
    ret = await func(*args, **kwargs)
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/worker.py", line 336, in launch_builtin_model
    await model_ref.load()
  [remaining frames repeat the worker-side chain above: xoscar send/receive, model.py load, pytorch/core.py load, compression.py load_compress_model]
  File "/home/gitlab-runner/miniconda3/envs/xinference/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 448, in from_config
    raise ValueError(
ValueError: [address=0.0.0.0:39293, pid=1947936] Unrecognized configuration class <class 'transformers_modules.chatglm-pytorch-6b.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM. [same model-type list as above]
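For context, the ValueError is raised inside transformers' auto factory, not in Xinference's own code: ChatGLMConfig is a remote-code config class with no entry in AutoModelForCausalLM's built-in mapping, and THUDM's model card loads chatglm-6b through AutoModel with trust_remote_code=True instead. Below is a minimal sketch that isolates the failing call outside Xinference; the cache path is taken from the log above, and the claim about ChatGLM's auto_map contents is an assumption you can verify with the print statement.

# Minimal repro sketch, outside Xinference. Assumes the chatglm-pytorch-6b
# snapshot is already cached at the path shown in the log above; adjust
# model_path for your machine.
from transformers import AutoConfig, AutoModel, AutoModelForCausalLM

model_path = "/home/gitlab-runner/.xinference/cache/chatglm-pytorch-6b"

config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
# ChatGLM's config.json "auto_map" is expected to expose its custom model
# class under "AutoModel" but not under "AutoModelForCausalLM"; verify here:
print(getattr(config, "auto_map", None))

try:
    # The exact call that raises in the tracebacks above.
    AutoModelForCausalLM.from_config(config, trust_remote_code=True)
except ValueError as exc:
    print("reproduced:", exc)

# THUDM's documented loading path goes through AutoModel instead:
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

If the auto_map printout indeed lacks an AutoModelForCausalLM entry, the ValueError follows directly from transformers' mapping check.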

ChengjieLi28 commented 10 months ago

Hi @liyanz1377. Please try downgrading your transformers library to version 4.33.2:

pip install transformers==4.33.2
liyanz1377 commented 10 months ago

> Hi @liyanz1377. Please try downgrading your transformers library to version 4.33.2:
>
> pip install transformers==4.33.2

Yes, it already matches that version:

(xinference) gitlab-runner@BJSW-T057:~/.xinference/cache$ pip list | grep transformers
ctransformers                  0.2.27
sentence-transformers          2.2.2
transformers                   4.33.2
transformers-stream-generator  0.0.4
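One caveat worth noting when pip list looks right but the error persists: pip reports whatever environment the pip on PATH belongs to, while the Xinference supervisor and worker processes import transformers from the interpreter they were started with. A quick sanity check, run with that same interpreter (the env path below is the one visible in the tracebacks above):

# Run with the interpreter that launches xinference, e.g.
# /home/gitlab-runner/miniconda3/envs/xinference/bin/python
import transformers

# After the suggested downgrade this should print 4.33.2 and a path inside
# the xinference conda env; anything else means the server process imports
# a different transformers build than the one pip reported.
print(transformers.__version__)
print(transformers.__file__)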

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 5 days since being marked as stale.