infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

How do I call a model deployed using fastchat? #498

Open ciaoyizhen opened 5 months ago

ciaoyizhen commented 5 months ago

Describe your problem

Reading the related issues, the advice is to use Ollama to run a local model. However, https://ollama.com/library doesn't support ChatGLM (or supporting ChatGLM with Ollama would take a lot of work), and I'm already using FastChat to deploy other apps, so I'd like to reuse that deployment. Can I serve the model with FastChat and wrap the interface myself with FastAPI, disguising it as Ollama? What key interfaces do I need to provide to RAGFlow?
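One possible approach is a thin shim that translates between Ollama-style chat calls and FastChat's OpenAI-compatible server (started with `python -m fastchat.serve.openai_api_server`). A minimal sketch of the translation layer, assuming FastChat listens at `http://localhost:8000` and that the client speaks the Ollama `/api/chat` request/response shape (model name and URL below are placeholders, not anything RAGFlow mandates):

```python
# Sketch: bridge Ollama-style chat requests to FastChat's OpenAI-compatible API.
# Assumptions: FastChat's openai_api_server runs at FASTCHAT_URL; the request/
# response shapes follow Ollama's /api/chat (non-streaming case only).
import json
import urllib.request

FASTCHAT_URL = "http://localhost:8000/v1/chat/completions"  # assumed address

def ollama_to_openai(ollama_req: dict) -> dict:
    """Translate an Ollama /api/chat request body into an OpenAI-style one."""
    return {
        "model": ollama_req["model"],
        "messages": ollama_req["messages"],
        "stream": ollama_req.get("stream", False),
    }

def openai_to_ollama(model: str, openai_resp: dict) -> dict:
    """Re-wrap an OpenAI chat completion as an Ollama-style response."""
    choice = openai_resp["choices"][0]
    return {
        "model": model,
        "message": {"role": "assistant", "content": choice["message"]["content"]},
        "done": True,
    }

def chat(ollama_req: dict) -> dict:
    """Forward one non-streaming chat call to FastChat and re-wrap the reply."""
    body = json.dumps(ollama_to_openai(ollama_req)).encode()
    req = urllib.request.Request(
        FASTCHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return openai_to_ollama(ollama_req["model"], json.load(resp))
```

The two pure translation functions could then be mounted behind FastAPI routes (e.g. `POST /api/chat`); streaming responses would need additional handling.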

Logistic98 commented 3 months ago

Same need here. Ollama is just a toy and is painful to use; why not standardize on the OpenAI API format for integration? It has already become the industry norm. The models Ollama officially ships are all 4-bit quantized, adding a custom model means converting the format yourself, and there is no vLLM inference optimization.
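Since FastChat already exposes an OpenAI-compatible endpoint, the "standardize on the OpenAI format" route means a client only needs a base URL and a model name. A minimal sketch (the URL, model name, and temperature are illustrative assumptions):

```python
# Sketch: call a FastChat (or any OpenAI-compatible) server directly using
# the standard chat-completions request format. URL/model are placeholders.
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-format chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def post_chat(base_url: str, payload: dict) -> dict:
    """POST the payload to <base_url>/v1/chat/completions, return parsed JSON."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any backend that accepts this payload shape (FastChat, vLLM's OpenAI server, etc.) is interchangeable from the client's point of view.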