xusenlinzy / api-for-open-llm

OpenAI-style API for open large language models — use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. (A unified backend interface for open-source large language models.)
Apache License 2.0
2.33k stars · 264 forks

How do I use "add TGI generation endpoint forwarding" (添加 TGI 生成接口转发)? #211

Closed foxxxx001 closed 8 months ago

foxxxx001 commented 9 months ago

The following items must be checked before submission

Type of problem

Startup command

Operating system

None

Detailed description of the problem

# Paste the runtime code here (delete this code block if not applicable)

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

# Please paste the run log here
xusenlinzy commented 9 months ago
  1. Build the Docker image
docker build -t llm-api:tgi -f docker/Dockerfile.tgi .
  2. Start the TGI model server
model=/data/checkpoints/SUS-Chat-34B

docker run --gpus=all --shm-size 10g -d -p 7891:80 \
    -v /data/checkpoints:/data/checkpoints \
    llm-api:tgi --model-id $model --trust-remote-code
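Once the container is up, it can be worth verifying that TGI answers on the mapped port before wiring up the forwarder. A minimal standard-library sketch, assuming the port mapping above (`7891`) and TGI's native `/generate` endpoint; the helper name `build_generate_request` is my own, not part of this project:

```python
import json
import urllib.request

def build_generate_request(base_url: str, prompt: str,
                           max_new_tokens: int = 20) -> urllib.request.Request:
    """Construct a POST request for TGI's native /generate endpoint."""
    payload = {"inputs": prompt,
               "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("http://localhost:7891", "Hello")
# With the TGI container running, uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["generated_text"])
```

If this request succeeds against the container, the `TGI_ENDPOINT` used in the next step points at a working backend.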
  3. Forward it as an OpenAI-compatible API

A reference docker-compose.yml:

version: '3.9'

services:
  apiserver:
    image: llm-api:tgi
    command: python api/server.py
    ulimits:
      stack: 67108864
      memlock: -1
    environment:
      - PORT=8000
      - MODEL_NAME=sus-chat
      - ENGINE=tgi
      - TGI_ENDPOINT=http://192.168.20.59:7891  # IP and port of the TGI server started in step 2
    volumes:
      - $PWD:/workspace
    env_file:
      - .env.example
    ports:
      - "7892:8000"
    restart: always
    networks:
      - apinet
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

networks:
  apinet:
    driver: bridge
    name: apinet

Finally, start the forwarding service:

docker-compose up -d
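After the forwarder is up, the model can be queried through the standard OpenAI chat-completions route. A minimal standard-library sketch, assuming the published port (`7892`) and the `MODEL_NAME=sus-chat` set in the compose file; the helper name `build_chat_request` is my own:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Construct a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("http://localhost:7892", "sus-chat", "你好")
# With the forwarding service running, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the interface is OpenAI-compatible, the official `openai` client should also work by pointing its base URL at `http://localhost:7892/v1`.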