netease-youdao / QAnything

Question and Answer based on Anything.
https://qanything.ai
GNU Affero General Public License v3.0
11.38k stars 1.1k forks

Consider supporting CPU inference #196

Open ihacku opened 5 months ago

ihacku commented 5 months ago

In some scenarios, such as a ticket-reply bot, real-time responses are not required. As long as memory is sufficient, slower CPU inference that still produces results would be acceptable.

Ma-Dan commented 5 months ago

Try my method: https://github.com/Ma-Dan/QAnything/blob/cpu/%E6%9C%AC%E5%9C%B0CPU%E9%83%A8%E7%BD%B2%E5%92%8C%E8%B0%83%E8%AF%95%E6%96%B9%E6%B3%95.txt — milvus and mysql still run in Docker, while the 3 models and the front-end/back-end services run locally.

dubeno commented 5 months ago

You can run llamafile (GGUF) on your Windows PC or another OS. It supports CPU inference and provides an OpenAI-compatible API.

You should modify this file (QAnything/tree/master/qanything_kernel/connector/llm/llm_for_online.py) to point at your local LLM server endpoint.
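To illustrate the idea, here is a minimal sketch of calling a local OpenAI-compatible server from Python. It assumes llamafile's default server address (`http://localhost:8080`) and a placeholder model name; adapt the base URL and model to your own setup before wiring it into `llm_for_online.py`.

```python
import json
import urllib.request

# Assumption: llamafile's built-in server listens on localhost:8080 and
# exposes an OpenAI-compatible /v1/chat/completions endpoint.
LOCAL_BASE = "http://localhost:8080/v1"


def build_chat_request(prompt, base=LOCAL_BASE, model="local-model"):
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{base}/chat/completions"
    payload = {
        "model": model,  # placeholder; many local servers ignore this field
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(payload).encode("utf-8")


def ask(prompt):
    """Send a chat completion request to the local server and return the reply."""
    url, body = build_chat_request(prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

With a llamafile server running, `ask("hello")` would return the model's reply; only the base URL and model name should need changing for other OpenAI-compatible backends.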

Ma-Dan commented 5 months ago

I tried it for a whole morning and then switched back to GPU.

successren commented 5 months ago

We now support a pure-Python environment installation; the docs are here: https://github.com/netease-youdao/QAnything?tab=readme-ov-file#installationpure-python-environment

The pure-Python installation supports Mac and CPU-only machines.

Please try again!

GreatStep commented 2 weeks ago

With the online OpenAI mode, you can simply substitute another LLM for OpenAI. It can run on CPU, just much slower, since the LLM still has to be spun up.