-
### System Info
CUDA: 12.2
Python: 3.10.14
OS: CentOS 7.9
Package Version
### Running Xinference with Docker?
- [ ] docker
- [x] pip install / via pip i…
-
The environment is CUDA 12.2 with PyTorch 2.3.1.
unsloth was installed with:
pip install "unsloth[cu121-ampere-torch230] @ git+https://github.com/unslothai/unsloth.git"
The unsloth installation succeeded.
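Before installing, a quick sanity check of the versions this report lists (Python 3.10, CUDA 12.2, torch 2.3.1) can save a failed build. A minimal sketch, with PyTorch imported lazily so it also runs where PyTorch is not yet installed:

```python
# Minimal environment check before installing unsloth.
# Target versions from this report: Python 3.10, CUDA 12.2, torch 2.3.1.
import sys

def check_env():
    info = {"python": sys.version_info[:3]}
    try:
        import torch  # optional: only inspected if PyTorch is already installed
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda          # CUDA version torch was built against
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        info["torch"] = None                       # PyTorch not installed yet
    return info

if __name__ == "__main__":
    print(check_env())
```

Compare the printed versions against the `cu121-ampere-torch230` extra before running the pip command above.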
T…
-
### Describe the bug
https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html
Documentation error
from xinference.client import RESTfulClient
client = RESTfulClient("http://127.0.0.1…
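For context, a stdlib-only sketch of probing a locally running Xinference server over REST, assuming the default port 9997 and an OpenAI-compatible `/v1/models` endpoint (both are assumptions; verify them against your Xinference version):

```python
# Hedged sketch: check whether an Xinference server is reachable and
# list its models. Assumes http://127.0.0.1:9997 and a /v1/models route.
import json
import urllib.error
import urllib.request

def list_models(base_url="http://127.0.0.1:9997"):
    """Return the server's model list as a dict, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=2) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None  # server not running or not reachable
```

This avoids depending on the client class the quickstart documents, which is useful when the documented snippet itself is what is broken.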
-
### 🚀 The feature, motivation and pitch
We have a deployment of the Llama3.1-8B-Instruct and Llama3.1-70B-Instruct models through vLLM, hosted on our on-premise GPU infrastructure.
While testing different use-ca…
-
### Bug Description
We get the following error when using dense_x with Elasticsearch, especially with documents of 70+ pages:
```
raise self._make_status_error_from_response(err.response) from…
```
-
Can I use the ollama Python package to interact with LaVague?
-
```
model_name: llama-2-7b-chat
[load_smoothquant_model] model loaded ...
modules.json: 100%|███████████████████████████████████████████████████████████████████████████| 349/349 [00:00
```
-
### Error Description
I am encountering the error `Native API returns: -30 (PI_ERROR_INVALID_VALUE)` when trying to run llama.cpp with the latest IPEX-LLM, following the official quickstart guide o…
-
### Bug Description
Hi there,
I want to save a VectorStoreIndex to ChromaDB (that is what the initialization script does), and then, in another script, read the VectorStoreIndex f…
-
# Setup Environment
First, make sure that everything works in `https://github.com/microsoft/Megatron-DeepSpeed/tree/main/examples_deepspeed/finetune_hf_llama`. This ensures that you have sol…