songquanpeng / one-api

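OpenAI key management & redistribution system, using a single API for all LLMs. It supports Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot (文心一言), iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, deploys with one click, works out of the box, and features an English UI.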
https://openai.justsong.cn/
MIT License
18.22k stars 4.12k forks

Slave server + Redis does not achieve zero database access (API response latency too high) #364

Closed SunnyGPT closed 1 year ago

SunnyGPT commented 1 year ago

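Routine checks

Problem description: After setting up a slave server with Redis and a remote database, API response latency is too high and the expected zero database access is not achieved.

Steps to reproduce

  1. Update package sources, install Docker, and install and configure Redis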

sudo apt -y && sudo apt install docker. -y && sudo gpass -a ${US} dock** sudo apt * && sudo apt install redis- && sudo systemctl redis- sudo systemctl enable redis- sudo nano /etc/redis/redis. requirepass ****

  2. Deploy

docker run --name one-api -d --network="host" --restart always -e SESSION_SECRET= -e SQL_DSN="<master database username>:<password>@tcp(<master server IP (remote/cross-region)>:3306)/oneapi?tls=true" -e REDIS_CONN_STRING=redis://:@localhost:6379 -e SYNC_FREQUENCY=60 -e NODE_TYPE=slave -p 3000:3000 justsong/one-api

  3. Configure Nginx and certificates (omitted)

  4. Send an API request from the slave server with curl (request body):

curl -X POST "https://api.***.com/v1/chat/completions" -H "Content-Type: application/json" -H "Authorization: Bearer sk-***" -d '{"model": "gpt-3.5-turbo-0613", "messages": [{"role": "user", "content": "hello?"}], "temperature": 0.7}' -w "Total time: %{time_total}\n"

  5. Response (with timing). For identical requests, the first took 11.57 seconds to complete and the second took 6 seconds.

{ "id": "chatcmpl-7kZ1JFFsAAmArb6s6Xey0c0TqP4nd", "object": "chat.completion", "created": 1691332061, "model": "gpt-3.5-turbo-0613", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! How can I assist you today?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 9, "completion_tokens": 9, "total_tokens": 18 } } Total time: 11.576355

  6. Container logs
    
    [SYS] 2023/08/06 - 13:45:40 | One API v0.5.1.7 started 
    [SYS] 2023/08/06 - 13:45:40 | using MySQL as database 
    [SYS] 2023/08/06 - 13:45:41 | database connected 
    [SYS] 2023/08/06 - 13:45:41 | Redis is enabled 

2023/08/06 13:45:42 /build/model/option.go:18 SLOW SQL >= 200ms [406.589ms] [rows:26] SELECT * FROM options

2023/08/06 13:45:42 /build/model/cache.go:123 SLOW SQL >= 200ms [405.056ms] [rows:10] SELECT * FROM channels WHERE status = 1

2023/08/06 13:45:42 /build/model/cache.go:128 SLOW SQL >= 200ms [415.914ms] [rows:351] SELECT * FROM abilities

[SYS] 2023/08/06 - 13:45:42 | channels synced from database
[SYS] 2023/08/06 - 13:46:42 | syncing options from database
[SYS] 2023/08/06 - 13:46:42 | syncing channels from database

2023/08/06 14:10:24 /build/model/cache.go:30 SLOW SQL >= 200ms [404.628ms] [rows:1] SELECT * FROM tokens WHERE key = 'u73uf8RT*****675dC7' ORDER BY tokens.id LIMIT 1

2023/08/06 14:10:24 /build/model/user.go:234 SLOW SQL >= 200ms [407.941ms] [rows:1] SELECT status FROM users WHERE id = 1

2023/08/06 14:10:24 /build/model/user.go:270 SLOW SQL >= 200ms [409.534ms] [rows:1] SELECT group FROM users WHERE id = 1

2023/08/06 14:10:25 /build/model/token.go:108 SLOW SQL >= 200ms [1019.196ms] [rows:1] UPDATE tokens SET status=1,accessed_time=1691331024 WHERE id = 66

2023/08/06 14:10:28 /build/model/user.go:255 SLOW SQL >= 200ms [403.304ms] [rows:1] SELECT quota FROM users WHERE id = 1

2023/08/06 14:10:29 /build/model/token.go:89 SLOW SQL >= 200ms [401.854ms] [rows:1] SELECT * FROM tokens WHERE id = 66 AND tokens.id = 66 ORDER BY tokens.id LIMIT 1

2023/08/06 14:10:30 /build/model/user.go:286 SLOW SQL >= 200ms [1019.564ms] [rows:1] UPDATE users SET quota=quota - 6 WHERE id = 1

2023/08/06 14:10:31 /build/model/token.go:147 SLOW SQL >= 200ms [1014.971ms] [rows:1] UPDATE tokens SET remain_quota=remain_quota - 6,used_quota=used_quota + 6 WHERE id = 66

2023/08/06 14:10:31 /build/model/user.go:255 SLOW SQL >= 200ms [202.865ms] [rows:1] SELECT quota FROM users WHERE id = 1

2023/08/06 14:10:32 /build/model/user.go:308 SLOW SQL >= 200ms [402.425ms] [rows:1] SELECT username FROM users WHERE id = 1

2023/08/06 14:10:33 /build/model/log.go:63 SLOW SQL >= 200ms [1019.869ms] [rows:1] INSERT INTO logs (user_id,created_at,type,content,username,token_name,model_name,quota,prompt_tokens,completion_tokens) VALUES (1,1691331032,2,'model rate 0.30, group rate 1.00','root','admin','gpt-3.5-turbo-0613',6,9,9)

2023/08/06 14:10:34 /build/model/user.go:296 SLOW SQL >= 200ms [1012.964ms] [rows:1] UPDATE users SET request_count=request_count + 1,used_quota=used_quota + 6 WHERE id = 1

2023/08/06 14:10:35 /build/model/channel.go:144 SLOW SQL >= 200ms [1013.993ms] [rows:1] UPDATE channels SET used_quota=used_quota + 6 WHERE id = 59

[GIN] 2023/08/06 - 14:10:35 | 200 | 11.536755359s | 3.26.73.247 | POST "/v1/chat/completions"

2023/08/06 14:10:39 /build/model/token.go:108 SLOW SQL >= 200ms [817.594ms] [rows:1] UPDATE tokens SET status=1,accessed_time=1691331038 WHERE id = 66

2023/08/06 14:10:39 /build/model/token.go:89 SLOW SQL >= 200ms [406.147ms] [rows:1] SELECT * FROM tokens WHERE id = 66 AND tokens.id = 66 ORDER BY tokens.id LIMIT 1

2023/08/06 14:10:40 /build/model/user.go:286 SLOW SQL >= 200ms [817.638ms] [rows:1] UPDATE users SET quota=quota - 6 WHERE id = 1

2023/08/06 14:10:41 /build/model/token.go:147 SLOW SQL >= 200ms [814.005ms] [rows:1] UPDATE tokens SET remain_quota=remain_quota - 6,used_quota=used_quota + 6 WHERE id = 66

2023/08/06 14:10:41 /build/model/user.go:255 SLOW SQL >= 200ms [404.617ms] [rows:1] SELECT quota FROM users WHERE id = 1

2023/08/06 14:10:42 /build/model/user.go:308 SLOW SQL >= 200ms [411.797ms] [rows:1] SELECT username FROM users WHERE id = 1

2023/08/06 14:10:43 /build/model/log.go:63 SLOW SQL >= 200ms [826.399ms] [rows:1] INSERT INTO logs (user_id,created_at,type,content,username,token_name,model_name,quota,prompt_tokens,completion_tokens) VALUES (1,1691331042,2,'model rate 0.30, group rate 1.00','root','admin','gpt-3.5-turbo-0613',6,9,9)

2023/08/06 14:10:43 /build/model/user.go:296 SLOW SQL >= 200ms [816.332ms] [rows:1] UPDATE users SET request_count=request_count + 1,used_quota=used_quota + 6 WHERE id = 1

2023/08/06 14:10:44 /build/model/channel.go:144 SLOW SQL >= 200ms [820.787ms] [rows:1] UPDATE channels SET used_quota=used_quota + 6 WHERE id = 55

[GIN] 2023/08/06 - 14:10:44 | 200 | 6.020411356s | 3.26.73.247 | POST "/v1/chat/completions"
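
For a rough sense of how much of the latency is database time, the slow-statement durations in the log can be added up. For the first request, the 13 statements logged between 14:10:24 and 14:10:35 sum to about 8.73 s, against the 11.54 s that Gin reports. For the second request, the 9 logged statements sum to about 6.14 s against 6.02 s, which suggests that some statements overlap with each other, so the sum is an indication rather than an exact share. Only statements above the 200 ms threshold are counted. A minimal Go sketch of that tally follows; `sumSlowSQL` is a hypothetical helper written for this illustration, not part of one-api.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// slowSQL matches the duration printed for a slow statement, e.g. "[406.589ms]".
var slowSQL = regexp.MustCompile(`SLOW SQL >= 200ms \[([0-9.]+)ms\]`)

// sumSlowSQL returns the number of slow statements found in logText and
// their combined duration in milliseconds.
func sumSlowSQL(logText string) (count int, totalMs float64) {
	for _, m := range slowSQL.FindAllStringSubmatch(logText, -1) {
		ms, err := strconv.ParseFloat(m[1], 64)
		if err != nil {
			continue // skip anything that is not a plain decimal number
		}
		count++
		totalMs += ms
	}
	return count, totalMs
}

func main() {
	// Three UPDATE lines copied from the first request in the log above.
	excerpt := `2023/08/06 14:10:25 /build/model/token.go:108 SLOW SQL >= 200ms [1019.196ms] [rows:1] UPDATE tokens SET status=1,accessed_time=1691331024 WHERE id = 66
2023/08/06 14:10:30 /build/model/user.go:286 SLOW SQL >= 200ms [1019.564ms] [rows:1] UPDATE users SET quota=quota - 6 WHERE id = 1
2023/08/06 14:10:31 /build/model/token.go:147 SLOW SQL >= 200ms [1014.971ms] [rows:1] UPDATE tokens SET remain_quota=remain_quota - 6,used_quota=used_quota + 6 WHERE id = 66`

	n, total := sumSlowSQL(excerpt)
	fmt.Printf("%d slow statements, %.3f ms in total\n", n, total)
}
```

In this log every SELECT costs roughly 400 ms and every write roughly 800 to 1000 ms, so the number of statements per request matters more than what each statement does.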



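**Expected result**
Achieve zero database access. In detail:
1. Every 1 or n minutes, refresh the cache from the database (cache everything except the log table in Redis, including but not limited to user and channel information), so that requests need no database access.
2. Every 1 or n minutes, write back to the database (buffer user usage, quota changes, and so on in Redis, and write them to the database once per interval).
3. When a user logs in to the front end to view logs, read them from the database.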
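
Item 2 describes a write-behind pattern: accumulate quota changes locally and write them to the database once per interval. A minimal sketch in Go, assuming an in-memory map stands in for Redis; `quotaBatcher` and its methods are hypothetical names for illustration and are not taken from one-api.

```go
package main

import (
	"fmt"
	"sync"
)

// quotaBatcher accumulates per-user quota consumption in memory so that
// many requests result in one database write per user per flush.
type quotaBatcher struct {
	mu      sync.Mutex
	pending map[int]int64 // user id -> quota consumed since the last flush
}

func newQuotaBatcher() *quotaBatcher {
	return &quotaBatcher{pending: make(map[int]int64)}
}

// Add records consumption without touching the database.
func (b *quotaBatcher) Add(userID int, quota int64) {
	b.mu.Lock()
	b.pending[userID] += quota
	b.mu.Unlock()
}

// Flush hands each accumulated delta to apply (for example, one UPDATE per
// user) and returns how many users were written. Deltas whose write fails
// are put back and retried on the next flush.
func (b *quotaBatcher) Flush(apply func(userID int, quota int64) error) int {
	b.mu.Lock()
	batch := b.pending
	b.pending = make(map[int]int64)
	b.mu.Unlock()

	written := 0
	for id, q := range batch {
		if err := apply(id, q); err != nil {
			b.Add(id, q)
			continue
		}
		written++
	}
	return written
}

func main() {
	b := newQuotaBatcher()
	for i := 0; i < 3; i++ {
		b.Add(1, 6) // three requests costing 6 quota each, as in the log
	}
	n := b.Flush(func(userID int, quota int64) error {
		// A real implementation would execute this statement instead of printing it.
		fmt.Printf("UPDATE users SET quota = quota - %d WHERE id = %d\n", quota, userID)
		return nil
	})
	fmt.Println("users written:", n)
}
```

In a service, `Flush` would typically run from a `time.Ticker` loop at the chosen interval. The trade-off is that deltas still in memory are lost if the process crashes, and a user can briefly exceed their quota between flushes.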

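Summary: at present a user's key is cached only after that user makes a request, and information such as quota does not appear to be cached at all, which leads to long response times. Apart from the log table, the other tables take up almost no database space, so in principle they could all be cached in the slave server's Redis, achieving zero database access in the true sense.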

**Screenshots [comparison of slave server + Redis + master server database against master server + local database]**
![image](https://github.com/songquanpeng/one-api/assets/126902077/10431fba-7270-4830-b2e9-64f57f99d497)
![image](https://github.com/songquanpeng/one-api/assets/126902077/25c2e303-686d-4489-8641-6dcc27e12522)
songquanpeng commented 1 year ago

Please add figures for the slave server accessing OpenAI's servers directly (averaged over multiple tests), to serve as a baseline for reference.

SunnyGPT commented 1 year ago

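> Please add figures for the slave server accessing OpenAI's servers directly (averaged over multiple tests), to serve as a baseline for reference.

Here are the results of multiple tests of the slave server accessing OpenAI directly. image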

songquanpeng commented 1 year ago

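In non-stream mode, the response is now explicitly flushed so that the response body is returned before quota is deducted; stream mode should be unaffected.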