Open sipie800 opened 2 weeks ago
For the 4096-token context (which is forced by Omost), running the llama-3 model on a 4090 takes about 120s to complete the prompt, while the SD stage takes only 7s. That's a big gap. How can we accelerate the local LLM?
The official way of running the LLM is slow. You can speed it up by serving the model with TGI or llama.cpp instead.
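If you go the llama.cpp route, a minimal sketch using llama-cpp-python looks like the following. The GGUF file name and the prompt are assumptions, not Omost's actual wiring; substitute whatever quantized Llama 3 checkpoint you have.

```python
# Sketch: run a quantized Llama 3 via llama-cpp-python with full GPU offload,
# instead of the stock pipeline. Model path and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # assumed quantized checkpoint
    n_gpu_layers=-1,  # offload all layers to the 4090
    n_ctx=4096,       # the context length Omost requires
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "generate an image of a cat on a sofa"}],
    max_tokens=4096,
)
print(out["choices"][0]["message"]["content"])
```

TGI works similarly: launch the text-generation-inference server (e.g. via its Docker image) and point an OpenAI-compatible client at it. A 4-bit quantization also frees enough VRAM to keep SD on the same 4090.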