eosphoros-ai / DB-GPT

AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
http://docs.dbgpt.cn
MIT License

local code generate slowly #2034

Open yx1405585468 opened 1 month ago

yx1405585468 commented 1 month ago


Description

Hi, I deployed DB-GPT on my own computer and loaded a local model, Qwen2 0.5B. Inference inside DB-GPT is very fast: I ask questions and it answers them almost immediately. However, when I try to run the same model from PyCharm, I am writing code like this:

```python
from transformers import pipeline

messages = [{"role": "user", "content": "Who are you?"}]
pipe = pipeline("text-generation", model="Qwen/Qwen2-0.5B")
pipe(messages)
```

Inference is very slow this way, even though I'm sure CUDA is available to speed it up. I don't understand why this is the case, and I'd like to achieve very fast inference locally as well.
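One possible explanation (an assumption, not confirmed in this issue): `transformers.pipeline` runs on CPU unless a `device` is passed explicitly, so the model may never actually land on the GPU even when CUDA is installed. A minimal sketch that places the model explicitly and lets you verify where the weights live:

```python
import torch
from transformers import pipeline

# Assumption: the slowdown comes from the pipeline defaulting to CPU.
# device=0 selects GPU 0; device=-1 keeps the pipeline on CPU.
device = 0 if torch.cuda.is_available() else -1

pipe = pipeline(
    "text-generation",
    model="Qwen/Qwen2-0.5B",
    device=device,
    # fp16 halves memory traffic on GPU; keep fp32 on CPU
    torch_dtype=torch.float16 if device == 0 else torch.float32,
)

# Check the device the weights were actually loaded onto,
# e.g. cuda:0 vs cpu
print(pipe.model.device)
```

Note also that DB-GPT loads the model once at startup and keeps it resident, whereas a fresh script pays the full weight-loading cost on every run, so comparing a single `pipeline(...)` call against an already-warm server is not a like-for-like timing.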

Use case

No response

Related issues

No response

Feature Priority

None

Are you willing to submit PR?