SDAIer commented 1 month ago

System Info / 系統信息

NVIDIA-SMI 535.183.06 Driver Version: 535.183.06 CUDA Version: 12.2

Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece？

[X] docker / docker
[ ] pip install / 通过 pip install 安装
[ ] installation from source / 从源码安装

Version info / 版本信息

0.15.3

The command used to start Xinference / 用以启动 xinference 的命令

dock run

Reproduction / 复现过程

fastgpt 对接xinference, fastgpt 配置文件config.json中 "maxContext":和axResponse"设置的参数是否能传到xinference？

xf下有没有好一点的支持长文本分析的本地模型，例如分析合同、财报等，测试了好几个要么提示gpu资源不足（事实上gpu有资源）要么提示超过了token，但是xf提示的token好像是一个默认的数值，不是fastgpt config.json设置的maxContext

Expected behavior / 期待表现

ok

Valdanitooooo commented 1 month ago

fastgpt 作为 xinference API 的消费者不需要传递 xinference 的启动参数 max_model_len 把 xinference API 当成 openai 的 API 来用就行了没用过 fastgpt，maxContext指的是请求 API 的 max_tokens吗

SDAIer commented 1 month ago

maxContent是上下文，这个参数xf默认较小，导致无法处理长文本。我用fastgpt—oneapi—xf本地模型，如何处理长文本这个参数值。

具体下面的问题里有描述，麻烦看一下，多谢

xorbitsai / inference

fastgpt 对接xinference关于 "maxContext":和axResponse"的问题 #2370