What happened
I was able to install and start the entire service with Qwen1.5-14B-Chat, but when I switched the LLM to Qwen2-7B-Instruct, I got the following error:
2024-09-30 14:10:52 WIN-MEUAL5THNML dbgpt.model.llm_out.hf_chat_llm[10332] INFO Predict with parameters: {'max_length': 131072, 'temperature': 0.6, 'streamer': <transformers.generation.streamers.TextIteratorStreamer object at 0x0000023D798BDBD0>, 'top_p': 1.0, 'do_sample': True}
custom_stop_words: []
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Exception in thread Thread-9 (generate):
Traceback (most recent call last):
File "C:\Users\BOCAI\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\Users\BOCAI\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\transformers\generation\utils.py", line 1989, in generate
result = self._sample(
File "C:\Users\BOCAI\Desktop\DB-GPT-510\dbgpt_env\lib\site-packages\transformers\generation\utils.py", line 2969, in _sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
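A minimal sketch of the suspected failure chain (an assumption based on the traceback, not something the logs confirm): Qwen2's logits can exceed float16's maximum (~65504), which is a commonly reported issue when running it in fp16 on pre-Ampere GPUs such as the V100. An overflowed logit becomes `inf`, softmax over a row containing `inf` produces `nan`, and `torch.multinomial` then rejects the probability tensor with exactly the `RuntimeError` shown above:

```python
import torch

# An fp16 overflow turns a large logit into inf.
logits = torch.tensor([70000.0, 1.0]).to(torch.float16)  # 70000 > fp16 max -> inf

# Softmax over a row containing inf yields nan everywhere.
probs = torch.softmax(logits.float(), dim=-1)

# torch.multinomial rejects a probability tensor containing nan,
# which is the exact call that fails in _sample() in the traceback.
try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError as err:
    sampling_error = err
```

If this is the cause, loading the model in float32 (or another dtype the V100 handles safely; bfloat16 is not supported on that hardware) may be worth trying.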
What you expected to happen
The service should start and run with the new model just as it did with the old one, differing only in the new model's performance.
How to reproduce
1. Install the project with its dependencies.
2. Make sure the Qwen1.5-14B-Chat and Qwen2-7B-Instruct model folders are present under the ./model directory.
3. Change the model selection in the .env file from Qwen1.5-14B-Chat to Qwen2-7B-Instruct.
4. Run python dbgpt/app/dbgpt_server.py as usual.
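Steps 3-4 can be sketched as below. The variable name `LLM_MODEL` is an assumption (check your own .env for the exact key); the edit is shown against a throwaway copy so it can be verified before touching the real file:

```shell
# Demo .env with the old model selected (LLM_MODEL key name is assumed).
printf 'LLM_MODEL=Qwen1.5-14B-Chat\n' > demo.env

# Swap the model selection in place.
sed -i 's/^LLM_MODEL=Qwen1.5-14B-Chat$/LLM_MODEL=Qwen2-7B-Instruct/' demo.env
cat demo.env   # LLM_MODEL=Qwen2-7B-Instruct

# Then restart the server as usual:
# python dbgpt/app/dbgpt_server.py
```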
Search before asking
Operating system information
Windows
Python version information
3.10
DB-GPT version
main
Related scenes
Installation Information
[X] Installation From Source
[ ] Docker Installation
[ ] Docker Compose Installation
[ ] Cluster Installation
[ ] AutoDL Image
[ ] Other
Device information
GPU Count: 4
GPU Memory: 32GB each (V100)
Models information
LLM: Qwen1.5-14B-Chat → Qwen2-7B-Instruct
Embedding: text2vec-large-chinese
Additional context
No response
Are you willing to submit PR?