xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Error when calling an Xinference LLM via LangChain #2621

Open winter-JX opened 5 hours ago

winter-JX commented 5 hours ago

System Info

CUDA 12.4, transformers 4.46.3, Python 3.10, Ubuntu 20.04

Running Xinference with Docker?

Version info

xinference 1.0.0, xinference-client 0.13.3

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997
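
For the reproduction below to work, the LLM and embedding models must already be launched on that server. As a rough sketch only (model names and parameters here are illustrative assumptions, not taken from the report), they can be launched with the bundled REST client:

from xinference.client import RESTfulClient

client = RESTfulClient("http://127.0.0.1:9997")  # host/port assumed from the command above

# Illustrative model names; the report's actual model_uids are
# "qwen2.5_1.5B" and "bge-large-zh-v1.5".
llm_uid = client.launch_model(model_name="qwen2.5-instruct", model_type="LLM")
emb_uid = client.launch_model(model_name="bge-large-zh-v1.5", model_type="embedding")
print(llm_uid, emb_uid)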

Reproduction

import streamlit as st

from langchain_community.llms import Xinference
from langchain_community.embeddings import XinferenceEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.text_splitter import CharacterTextSplitter

# Customize the layout
st.set_page_config(page_title="Local AI Chat Powered by Xinference", page_icon="🤖", layout="wide")

# Write uploaded file in temp dir
def write_text_file(content, file_path):
    try:
        with open(file_path, 'w') as file:
            file.write(content)
        return True
    except Exception as e:
        print(f"Error occurred while writing the file: {e}")
        return False

# Prepare prompt template
prompt_template = """使用下面的上下文来回答问题。
如果你不知道答案,就说你不知道,不要编造答案。
{context}
问题: {question}
回答: """
prompt = PromptTemplate(template=prompt_template, input_variables=["context", "question"])

# Initialize the Xinference LLM & Embeddings
xinference_server_url = ""  # e.g. "http://<host>:9997"
llm = Xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B")
embeddings = XinferenceEmbeddings(server_url=xinference_server_url, model_uid="bge-large-zh-v1.5")
llm_chain = LLMChain(llm=llm, prompt=prompt)

st.title("📄文档对话")
uploaded_file = st.file_uploader("上传文件", type="txt")

if uploaded_file is not None:
    # Persist the upload so TextLoader can read it from disk
    content = uploaded_file.read().decode('utf-8')
    file_path = "/tmp/file.txt"
    write_text_file(content, file_path)

    # Load, split and index the uploaded document
    loader = TextLoader(file_path)
    docs = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0)
    texts = text_splitter.split_documents(docs)
    db = Chroma.from_documents(texts, embeddings)
    st.success("上传文档成功")

    # Query through LLM
    question = st.text_input("提问", placeholder="请问我任何关于文章的问题", disabled=not uploaded_file)
    if question:
        similar_doc = db.similarity_search(question, k=1)
        st.write("相关上下文:")
        st.write(similar_doc)
        context = similar_doc[0].page_content
        # Reuse the chain built above instead of constructing a second one
        response = llm_chain.run({"context": context, "question": question})
        st.write(f"回答:{response}")

The line llm = Xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B") fails with:

pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference
client
  Field required [type=missing, input_value={'server_url': 'http://12...wargs': {'type': 'llm'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
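
The error says the wrapper's required client field was never populated during validation. A minimal sanity check, assuming the server address from the command above, is to call the REST client that the LangChain wrapper builds on directly; if this succeeds, the server is reachable and the failure is inside the wrapper (the version gap between xinference 1.0.0 and xinference-client 0.13.3 may be worth ruling out as well):

from xinference.client import RESTfulClient

# Succeeds only if the server is reachable and the client library matches it.
client = RESTfulClient("http://127.0.0.1:9997")
print(client.list_models())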

Expected behavior

This is exactly the example code from the Xinference documentation for document chat via LangChain. I'd like to know what causes this error and how to fix it.

qinxuye commented 4 hours ago

You can try connecting to Xinference the OpenAI way; the protocol is compatible.
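
A minimal sketch of that approach, assuming the server address from the report (Xinference exposes OpenAI-compatible endpoints under /v1, and the api_key just needs to be a non-empty string):

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")
response = client.chat.completions.create(
    model="qwen2.5_1.5B",  # the model_uid from the report
    messages=[{"role": "user", "content": "你好"}],
)
print(response.choices[0].message.content)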

winter-JX commented 4 hours ago

> You can try connecting to Xinference the OpenAI way; the protocol is compatible.

OK, I'll try that. But I've found another issue: with from langchain_community.llms import Xinference, right-clicking Xinference in the IDE cannot jump to the corresponding .py file. If I change it to lowercase, from langchain_community.llms import xinference, the jump works and the word xinference gets highlighted. The file it jumps to is shown below.

(screenshot: the langchain_community/llms/xinference.py module opened in the IDE)

But after changing it to lowercase, running llm = xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B") fails with: TypeError: 'module' object is not callable. I'd like to know why.
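
For reference, the two spellings name different objects: lowercase xinference is the submodule langchain_community/llms/xinference.py, which is why the IDE can jump to it, while capitalized Xinference is the class defined inside that module (the package re-exports it lazily, which is likely why go-to-definition fails on the capitalized name). A module object is not callable, hence the TypeError. A short sketch of the distinction:

# Lowercase: the submodule itself; calling it raises
# TypeError: 'module' object is not callable.
from langchain_community.llms import xinference
print(type(xinference))  # <class 'module'>

# Capitalized: the class defined inside that submodule.
from langchain_community.llms.xinference import Xinference
assert xinference.Xinference is Xinference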