What happened
I was able to install and start the entire service with Qwen1.5-14B-Chat, but when I switched the LLM to Qwen2-7B-Instruct, I got the following error:
2024-09-30 14:10:52 WIN-MEUAL5THNML dbgpt.model.llm_out.hf_chat_llm[10332] INFO Predict with parameters: {'max_length': 131072, 'temperature': 0.6, 'streamer': <transformers.generation.streamers.TextIteratorStreamer object at 0x0000023D798BDBD0>, 'top_p': 1.0, 'do_sample': True}
custom_stop_words: []
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Exception in thread Thread-9 (generate):
Traceback (most recent call last):
File "C:\Users\BOCAI\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\Users\BOCAI\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\transformers\generation\utils.py", line 1989, in generate
result = self._sample(
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\transformers\generation\utils.py", line 2969, in _sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
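A minimal sketch of the suspected failure chain (an assumption based on the traceback, not something the logs confirm): Qwen2's logits can exceed float16's maximum (~65504), which is a commonly reported issue when running it in fp16 on pre-Ampere GPUs such as the V100. An overflowed logit becomes `inf`, softmax over a row containing `inf` produces `nan`, and `torch.multinomial` then rejects the probability tensor with exactly the `RuntimeError` shown above:

```python
import torch

# An fp16 overflow turns a large logit into inf.
logits = torch.tensor([70000.0, 1.0]).to(torch.float16)  # 70000 > fp16 max -> inf

# Softmax over a row containing inf yields nan everywhere.
probs = torch.softmax(logits.float(), dim=-1)

# torch.multinomial rejects a probability tensor containing nan,
# which is the exact call that fails in _sample() in the traceback.
try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError as err:
    sampling_error = err
```

If this is the cause, loading the model in float32 (or another dtype the V100 handles safely; bfloat16 is not supported on that hardware) may be worth trying.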
What you expected to happen
The service should start and run with the new model just as it did with the old one, differing only in the new model's performance.
How to reproduce
1. Install the project with its dependencies.
2. Make sure the Qwen1.5-14B-Chat and Qwen2-7B-Instruct model folders are present under the ./model directory.
3. Change the model selection in the .env file from Qwen1.5-14B-Chat to Qwen2-7B-Instruct.
4. Run python dbgpt/app/dbgpt_server.py as usual.
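Steps 3-4 can be sketched as below. The variable name `LLM_MODEL` is an assumption (check your own .env for the exact key); the edit is shown against a throwaway copy so it can be verified before touching the real file:

```shell
# Demo .env with the old model selected (LLM_MODEL key name is assumed).
printf 'LLM_MODEL=Qwen1.5-14B-Chat\n' > demo.env

# Swap the model selection in place.
sed -i 's/^LLM_MODEL=Qwen1.5-14B-Chat$/LLM_MODEL=Qwen2-7B-Instruct/' demo.env
cat demo.env   # LLM_MODEL=Qwen2-7B-Instruct

# Then restart the server as usual:
# python dbgpt/app/dbgpt_server.py
```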
Search before asking
Operating system information
Windows
Python version information
3.10
DB-GPT version
main
Related scenes
Installation Information
[X] Installation From Source
[ ] Docker Installation
[ ] Docker Compose Installation
[ ] Cluster Installation
[ ] AutoDL Image
[ ] Other
Device information
GPU Count: 4
GPU Memory: 32GB each (V100)
Models information
LLM: Qwen1.5-14B-Chat → Qwen2-7B-Instruct
Embedding: text2vec-large-chinese
Additional context
No response
Are you willing to submit PR?