netease-youdao / QAnything

Question and Answer based on Anything.
https://qanything.ai
GNU Affero General Public License v3.0
11.38k stars 1.1k forks

Consider supporting CPU inference #196

Open ihacku opened 5 months ago

ihacku commented 5 months ago

In some scenarios, such as a ticket-reply bot, real-time responses are not required. As long as memory is sufficient, slower CPU inference that still produces results would be acceptable.

Ma-Dan commented 5 months ago

Try my method: https://github.com/Ma-Dan/QAnything/blob/cpu/%E6%9C%AC%E5%9C%B0CPU%E9%83%A8%E7%BD%B2%E5%92%8C%E8%B0%83%E8%AF%95%E6%96%B9%E6%B3%95.txt — milvus and mysql still run in Docker, while the 3 models and the front-end/back-end services run locally.

dubeno commented 5 months ago

You can run llamafile (GGUF) on your Windows PC or another OS. It supports CPU inference and provides an OpenAI-compatible API.

You should modify this file (QAnything/tree/master/qanything_kernel/connector/llm/llm_for_online.py) to point at your local LLM server endpoint.
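To illustrate the idea, here is a minimal sketch of calling a local OpenAI-compatible server from Python. It assumes llamafile's default server address (`http://localhost:8080`) and a placeholder model name; adapt the base URL and model to your own setup before wiring it into `llm_for_online.py`.

```python
import json
import urllib.request

# Assumption: llamafile's built-in server listens on localhost:8080 and
# exposes an OpenAI-compatible /v1/chat/completions endpoint.
LOCAL_BASE = "http://localhost:8080/v1"


def build_chat_request(prompt, base=LOCAL_BASE, model="local-model"):
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{base}/chat/completions"
    payload = {
        "model": model,  # placeholder; many local servers ignore this field
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(payload).encode("utf-8")


def ask(prompt):
    """Send a chat completion request to the local server and return the reply."""
    url, body = build_chat_request(prompt)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

With a llamafile server running, `ask("hello")` would return the model's reply; only the base URL and model name should need changing for other OpenAI-compatible backends.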

Ma-Dan commented 5 months ago

I tried it for a whole morning and then switched back to GPU.

successren commented 5 months ago

We now support a pure-Python environment installation; the docs are here: https://github.com/netease-youdao/QAnything?tab=readme-ov-file#installationpure-python-environment

The pure-Python installation supports Mac and CPU-only machines.

Please try again!

GreatStep commented 2 weeks ago

With the online OpenAI mode, you can simply substitute another LLM for OpenAI. It can run on CPU, just much slower, since the LLM still has to be spun up.