xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Error when calling an Xinference LLM via LangChain #2621

Open winter-JX opened 5 hours ago

winter-JX commented 5 hours ago

System Info

CUDA 12.4, transformers 4.46.3, Python 3.10, Ubuntu 20.04

Running Xinference with Docker?

Version info

xinference 1.0.0, xinference-client 0.13.3

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997
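
For the reproduction below to work, the LLM and embedding models must already be launched on that server. As a rough sketch only (model names and parameters here are illustrative assumptions, not taken from the report), they can be launched with the bundled REST client:

from xinference.client import RESTfulClient

client = RESTfulClient("http://127.0.0.1:9997")  # host/port assumed from the command above

# Illustrative model names; the report's actual model_uids are
# "qwen2.5_1.5B" and "bge-large-zh-v1.5".
llm_uid = client.launch_model(model_name="qwen2.5-instruct", model_type="LLM")
emb_uid = client.launch_model(model_name="bge-large-zh-v1.5", model_type="embedding")
print(llm_uid, emb_uid)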

Reproduction

import streamlit as st

from langchain_community.llms import Xinference
from langchain_community.embeddings import XinferenceEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.text_splitter import CharacterTextSplitter

# Customize the layout
st.set_page_config(page_title="Local AI Chat Powered by Xinference", page_icon="🤖", layout="wide")

# Write uploaded file in temp dir
def write_text_file(content, file_path):
    try:
        with open(file_path, 'w') as file:
            file.write(content)
        return True
    except Exception as e:
        print(f"Error occurred while writing the file: {e}")
        return False

# Prepare prompt template
prompt_template = """使用下面的上下文来回答问题。
如果你不知道答案,就说你不知道,不要编造答案。
{context}
问题: {question}
回答: """
prompt = PromptTemplate(template=prompt_template, input_variables=["context", "question"])

# Initialize the Xinference LLM & Embeddings
xinference_server_url = ""  # e.g. "http://<host>:9997"
llm = Xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B")
embeddings = XinferenceEmbeddings(server_url=xinference_server_url, model_uid="bge-large-zh-v1.5")
llm_chain = LLMChain(llm=llm, prompt=prompt)

st.title("📄文档对话")
uploaded_file = st.file_uploader("上传文件", type="txt")

if uploaded_file is not None:
    # Persist the upload so TextLoader can read it from disk
    content = uploaded_file.read().decode('utf-8')
    file_path = "/tmp/file.txt"
    write_text_file(content, file_path)

    # Load, split and index the uploaded document
    loader = TextLoader(file_path)
    docs = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0)
    texts = text_splitter.split_documents(docs)
    db = Chroma.from_documents(texts, embeddings)
    st.success("上传文档成功")

    # Query through LLM
    question = st.text_input("提问", placeholder="请问我任何关于文章的问题", disabled=not uploaded_file)
    if question:
        similar_doc = db.similarity_search(question, k=1)
        st.write("相关上下文:")
        st.write(similar_doc)
        context = similar_doc[0].page_content
        # Reuse the chain built above instead of constructing a second one
        response = llm_chain.run({"context": context, "question": question})
        st.write(f"回答:{response}")

The line llm = Xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B") fails with:

pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference
client
  Field required [type=missing, input_value={'server_url': 'http://12...wargs': {'type': 'llm'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
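
The error says the wrapper's required client field was never populated during validation. A minimal sanity check, assuming the server address from the command above, is to call the REST client that the LangChain wrapper builds on directly; if this succeeds, the server is reachable and the failure is inside the wrapper (the version gap between xinference 1.0.0 and xinference-client 0.13.3 may be worth ruling out as well):

from xinference.client import RESTfulClient

# Succeeds only if the server is reachable and the client library matches it.
client = RESTfulClient("http://127.0.0.1:9997")
print(client.list_models())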

Expected behavior

This is exactly the example code from the Xinference documentation for document chat via LangChain. I'd like to know what causes this error and how to fix it.

qinxuye commented 4 hours ago

You can try connecting to Xinference the OpenAI way; the protocol is compatible.
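
A minimal sketch of that approach, assuming the server address from the report (Xinference exposes OpenAI-compatible endpoints under /v1, and the api_key just needs to be a non-empty string):

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")
response = client.chat.completions.create(
    model="qwen2.5_1.5B",  # the model_uid from the report
    messages=[{"role": "user", "content": "你好"}],
)
print(response.choices[0].message.content)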

winter-JX commented 4 hours ago

> You can try connecting to Xinference the OpenAI way; the protocol is compatible.

OK, I'll try that. But I've found another issue: with from langchain_community.llms import Xinference, right-clicking Xinference in the IDE cannot jump to the corresponding .py file. If I change it to lowercase, from langchain_community.llms import xinference, the jump works and the word xinference gets highlighted. The file it jumps to is shown below.

(screenshot: the langchain_community/llms/xinference.py module opened in the IDE)

But after changing it to lowercase, running llm = xinference(server_url=xinference_server_url, model_uid="qwen2.5_1.5B") fails with: TypeError: 'module' object is not callable. I'd like to know why.
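
For reference, the two spellings name different objects: lowercase xinference is the submodule langchain_community/llms/xinference.py, which is why the IDE can jump to it, while capitalized Xinference is the class defined inside that module (the package re-exports it lazily, which is likely why go-to-definition fails on the capitalized name). A module object is not callable, hence the TypeError. A short sketch of the distinction:

# Lowercase: the submodule itself; calling it raises
# TypeError: 'module' object is not callable.
from langchain_community.llms import xinference
print(type(xinference))  # <class 'module'>

# Capitalized: the class defined inside that submodule.
from langchain_community.llms.xinference import Xinference
assert xinference.Xinference is Xinference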