xusenlinzy / api-for-open-llm

OpenAI-style API for open large language models — use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. (A unified backend interface for open-source large language models.)
Apache License 2.0
2.33k stars · 264 forks

How do I use "add TGI generation endpoint forwarding" (添加 TGI 生成接口转发)? #211

Closed foxxxx001 closed 8 months ago

foxxxx001 commented 9 months ago

The following items must be checked before submission

Type of problem

Startup command

Operating system

None

Detailed description of the problem

# Paste the runtime code here (delete this code block if not applicable)

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

# Please paste the run log here
xusenlinzy commented 9 months ago
  1. Build the Docker image
docker build -t llm-api:tgi -f docker/Dockerfile.tgi .
  2. Start the TGI model server
model=/data/checkpoints/SUS-Chat-34B

docker run --gpus=all --shm-size 10g -d -p 7891:80 \
    -v /data/checkpoints:/data/checkpoints \
    llm-api:tgi --model-id $model --trust-remote-code
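Once the container is up, it can be worth verifying that TGI answers on the mapped port before wiring up the forwarder. A minimal standard-library sketch, assuming the port mapping above (`7891`) and TGI's native `/generate` endpoint; the helper name `build_generate_request` is my own, not part of this project:

```python
import json
import urllib.request

def build_generate_request(base_url: str, prompt: str,
                           max_new_tokens: int = 20) -> urllib.request.Request:
    """Construct a POST request for TGI's native /generate endpoint."""
    payload = {"inputs": prompt,
               "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("http://localhost:7891", "Hello")
# With the TGI container running, uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["generated_text"])
```

If this request succeeds against the container, the `TGI_ENDPOINT` used in the next step points at a working backend.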
  3. Forward it as an OpenAI-compatible API

A reference docker-compose.yml:

version: '3.9'

services:
  apiserver:
    image: llm-api:tgi
    command: python api/server.py
    ulimits:
      stack: 67108864
      memlock: -1
    environment:
      - PORT=8000
      - MODEL_NAME=sus-chat
      - ENGINE=tgi
      - TGI_ENDPOINT=http://192.168.20.59:7891  # IP and port of the TGI server started in step 2
    volumes:
      - $PWD:/workspace
    env_file:
      - .env.example
    ports:
      - "7892:8000"
    restart: always
    networks:
      - apinet
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

networks:
  apinet:
    driver: bridge
    name: apinet

Finally, start the forwarding service:

docker-compose up -d
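After the forwarder is up, the model can be queried through the standard OpenAI chat-completions route. A minimal standard-library sketch, assuming the published port (`7892`) and the `MODEL_NAME=sus-chat` set in the compose file; the helper name `build_chat_request` is my own:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Construct a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("http://localhost:7892", "sus-chat", "你好")
# With the forwarding service running, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the interface is OpenAI-compatible, the official `openai` client should also work by pointing its base URL at `http://localhost:7892/v1`.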