System Info / 系統信息
Jetson AGX Orin 64GB jetpack 6.0
Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
No, running natively on the Jetson.
Version info / 版本信息
0.16.1
The command used to start Xinference / 用以启动 xinference 的命令
X_INFERENCE_HOME=/mnt/data/x_inference_data/data XINFERENCE_MODEL_SRC=modelscope xinference-local --host 127.0.0.1 --port 9997
Reproduction / 复现过程
Expected behavior / 期待表现
The GPU should be used for inference so that chat responses are faster. (原文：希望调用 GPU，提高对话速度。)
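One way to narrow this down is to confirm, before starting `xinference-local`, whether the inference backend can see the CUDA device at all. The sketch below is a hypothetical diagnostic (not part of Xinference): it probes PyTorch for CUDA availability, falling back gracefully if `torch` is not installed, and picks the device string accordingly.

```python
# Hedged sketch: check CUDA visibility before launching xinference-local.
# `pick_device` is a hypothetical helper for illustration, not a Xinference API.

def pick_device(cuda_available: bool) -> str:
    # Prefer the GPU when the CUDA runtime is visible; otherwise fall back to CPU.
    return "cuda" if cuda_available else "cpu"

try:
    import torch
    cuda = torch.cuda.is_available()
except ImportError:
    # torch missing entirely would itself explain CPU-only inference.
    cuda = False

print(f"device = {pick_device(cuda)}")
```

If this prints `device = cpu` on the AGX Orin, the installed PyTorch wheel likely lacks CUDA support for JetPack 6.0, which would explain slow CPU-bound chat regardless of Xinference configuration.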