labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without extensive setup or configuration.
https://fastgpt.in

rerank: app.py hangs when run, GPU utilization stays at 100%, and no result is ever output #1879

Closed xiaoToby closed 23 hours ago

xiaoToby commented 1 week ago

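Routine checks

Your version: v4.8

Problem description, log screenshots: First, I used the registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1 image to start the model and connect it to FastGPT. config.json: (image) docker-compose.yml: (image) FastGPT error message: (image)

After that, I started a new container and copied into it the model files and the app.py file from registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1. app.py: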

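import os

import uvicorn
from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List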

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'

class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]

class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance

RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-v2-m3")

class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None

class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        # documents is Optional, so treat both None and an empty list as "nothing to rank"
        if query_docs is None or not query_docs.documents:
            return []

        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)

        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results

@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"Error:\n{e}")
        return {"error": "Rerank failed"}

if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=7013)
    except Exception as e:
        print(f"API failed to start!\nError:\n{e}")
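
The ranking step in `fit_query_answer_rerank` can be checked without loading the model, which helps separate logic problems from a GPU hang. The following is a minimal sketch that reproduces only the sort-and-reshape logic from the code above. The function name `rank_by_score` and the scores are hypothetical, chosen for illustration, and are not model output.

```python
from typing import List


def rank_by_score(scores: List[float]) -> List[dict]:
    """Mirror the sort-and-reshape step of fit_query_answer_rerank.

    scores[i] is the relevance score of document i. The result lists
    document indices from most to least relevant.
    """
    indexed = [{"index": i, "relevance_score": s} for i, s in enumerate(scores)]
    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(indexed, key=lambda item: item["relevance_score"], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for three documents (not produced by the model).
    print(rank_by_score([0.12, 0.97, 0.55]))
```

With these made-up scores, document 1 comes first, then document 2, then document 0, which matches the shape of the `results` list the endpoint returns.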

Test file test.py:

import requests

url = "http://localhost:7013/v1/rerank"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-tidukjjinarerank"
}

data = {
    "model": "bge-reranker-v2-m3",
    "query": "Organic skincare products for sensitive skin",
    "documents": [
        "Eco-friendly kitchenware for modern homes",
        "Biodegradable cleaning supplies for eco-conscious consumers",
        "Organic cotton baby clothes for sensitive skin",
        "Natural organic skincare range for sensitive skin",
        "Tech gadgets for smart homes: 2024 edition",
        "Sustainable gardening tools and compost solutions",
        "Sensitive skin-friendly facial cleansers and toners",
        "Organic food wraps and storage solutions",
        "All-natural pet food for dogs with allergies",
        "Yoga mats made from recycled materials"
    ],
    "top_n": 3
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

The result is as follows: (image) (image) (image)

@c121914yu @nongmo677 @lijiajun1997 Steps to reproduce

Expected result

Related screenshots
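
When the service hangs, a request without a timeout blocks forever, so it is hard to tell a slow response from a stuck one. Below is a minimal sketch of a client that gives up after a fixed time. It uses the standard library's `urllib` instead of `requests` only to stay dependency-free (`requests.post` also accepts a `timeout` argument). The helper names `build_rerank_request` and `post_with_timeout`, the 30-second default, and the placeholder token are assumptions for illustration.

```python
import json
import urllib.request
from typing import List, Optional


def build_rerank_request(url: str, token: str, query: str,
                         documents: List[str]) -> urllib.request.Request:
    """Build a POST request carrying the query and documents as JSON."""
    body = json.dumps({"query": query, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def post_with_timeout(request: urllib.request.Request,
                      timeout_s: float = 30.0) -> Optional[dict]:
    """Send the request; return parsed JSON, or None on timeout or connection error."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return json.loads(response.read().decode("utf-8"))
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        print(f"Rerank request failed or timed out: {exc}")
        return None


if __name__ == "__main__":
    req = build_rerank_request(
        "http://localhost:7013/v1/rerank",
        "your-access-token",  # placeholder, replace with the real ACCESS_TOKEN
        "Organic skincare products for sensitive skin",
        ["Natural organic skincare range for sensitive skin",
         "Yoga mats made from recycled materials"],
    )
    print(post_with_timeout(req))
```

If this prints a timeout message while the GPUs sit at 100% utilization, the request reached the service and the stall is on the model side rather than in the network path.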

xiaoToby commented 1 week ago

https://github.com/labring/FastGPT/issues/1111#issuecomment-2033454165

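I switched to a different image and replaced the original app.py file inside the container. The result is the same: no output, and all three GPUs are at 100% utilization. @c121914yu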

Essence9999 commented 1 week ago

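(image) Modify the file path in app.py, build the image with docker build, create the container with docker run, configure oneapi, set up config following the official documentation, configure docker compose, and start it. Then run docker logs reranker to check whether the service started properly.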

xiaoToby commented 1 week ago

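(image) Modify the file path in app.py, build the image with docker build, create the container with docker run, configure oneapi, set up config following the official documentation, configure docker compose, and start it. Then run docker logs reranker to check whether the service started properly.

(image)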

xiaoToby commented 6 days ago

Solved. My solution was to add the CUDA_VISIBLE_DEVICES environment variable to the docker-compose file. (image)
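
The screenshot with the exact docker-compose entry is not preserved here. As a rough Python-side equivalent, the same variable can be set at the top of app.py, before CUDA is initialised. This is only a sketch: the helper name `pin_visible_gpus` and the value "0" (first GPU only) are assumptions, and the thread does not state which value was used or why it helps. Presumably, limiting the process to one GPU avoids multi-GPU handling in the model library.

```python
import os


def pin_visible_gpus(device_ids: str = "0") -> str:
    """Restrict CUDA to the given devices unless the variable is already set.

    Call this before importing torch or FlagEmbedding; setting the variable
    before any CUDA initialisation is the safe choice.
    """
    # setdefault keeps a value supplied from outside (e.g. by docker-compose).
    return os.environ.setdefault("CUDA_VISIBLE_DEVICES", device_ids)


if __name__ == "__main__":
    # "0" is an assumed value, not taken from the thread.
    print(pin_visible_gpus("0"))
```

Because `setdefault` is used, a value already provided through the container environment takes precedence over the default in the code.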