-
### 是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
### 该问题是否在FAQ中有解答? | Is there an existing ans…
-
## 🐛 Bug
I am trying to work with the Jiutian 13.9b MoE model, but I am getting an error in the model compilation step.
## To Reproduce
Steps to reproduce the behavior:
1. pip install --pre -U -f https://…
-
### Your current environment
**Error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument weight in method…
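For context, this RuntimeError typically means the model weights and the input tensors ended up on different CUDA devices. A generic sketch of the usual cause and fix follows; the model id and code are illustrative placeholders, not the reporter's actual setup:

```python
# Generic illustration of the "Expected all tensors to be on the same device"
# error: weights live on cuda:0 (or are sharded across GPUs) while the inputs
# stay on another device. Model id below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some/model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("hello", return_tensors="pt")
# Move the inputs to the device that holds the first weights; leaving them on
# a different GPU triggers the error quoted above.
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0]))
```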
-
Does minicpm-llama3-v-2_5 (int4) support concurrent API calls? With two or more concurrent calls it throws an error; a single call works fine.
![微信截图_20240619180246](https://github.com/xorbitsai/inference/assets/167763677/caba7bf3-199d-4a24-88a3-b0e9833b50b2)
![微信截图_2024061918…
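For reference, a minimal sketch of the kind of concurrent calls described above, issued against an OpenAI-compatible endpoint such as the one Xinference exposes by default; the endpoint URL, model UID, and prompt are assumptions for illustration, not taken from the report:

```python
# Sketch: two concurrent chat requests against an assumed Xinference
# OpenAI-compatible endpoint. A single call works; two or more concurrent
# calls reproduce the reported failure.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")  # assumed endpoint

def ask(i: int) -> str:
    resp = client.chat.completions.create(
        model="MiniCPM-Llama3-V-2_5",  # assumed model UID
        messages=[{"role": "user", "content": f"Describe request #{i} briefly."}],
    )
    return resp.choices[0].message.content

with ThreadPoolExecutor(max_workers=2) as pool:
    print(list(pool.map(ask, range(2))))
```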
-
Hi, given that RKNN-LLM now supports embeddings, I tried different models, but none of them works with embeddings.
Can you point me to a model that supports embeddings and is compatible with RKLLM?
…
-
Error occurred when executing MiniCPM_VQA:
Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
module 'triton' has no attribute '…
-
### Motivation
I found that input token logprobs are supported by the Offline Inference Pipeline, as mentioned in the [doc](https://lmdeploy.readthedocs.io/en/latest/inference/vl_pipeline.html#calculate-lo…
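For context, a minimal sketch of asking the offline pipeline for token log probabilities; the model path is a placeholder, and the `logprobs` field of `GenerationConfig` and the `get_ppl` call are assumptions based on lmdeploy's pipeline API that should be checked against the linked doc:

```python
# Sketch of lmdeploy's offline pipeline usage (names are assumptions).
from lmdeploy import GenerationConfig, pipeline

pipe = pipeline("internlm/internlm2-chat-7b")  # placeholder model

# Output-token logprobs: request the top-k log probabilities per generated token.
gen_config = GenerationConfig(logprobs=5, max_new_tokens=32)
resp = pipe(["Hello, how are you?"], gen_config=gen_config)
print(resp[0].logprobs)

# Input-token log likelihood: get_ppl scores the prompt tokens themselves.
print(pipe.get_ppl(["Hello, how are you?"]))
```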
-
## Describe the bug
Hey everyone,
I’m trying to run vision models in Rust on my M4 Pro (48GB RAM). After some research, I found [Mistral.rs](https://github.com/EricLBuehler/mistral.rs/tree/master)…