-
### Is there an existing issue for this?
- [X] I have searched the existing issues and did not find a match.
### Who can help?
_No response_
### What are you working on?
I am attempting to fine-t…
-
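When generating with baichuan-13b-chat using vLLM, many test inputs (of varying lengths, none exceeding the length limit) produce only a single period. Moreover, for some examples, generation works normally once a few words or sentences are deleted. What could be causing this?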
import torch
from vllm import LLM, SamplingParams
sampling_params = SamplingParams(temperature=0,…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
Hello,
I have followed along with the advanced training case study, and I believe I was successful in training a model (at least, there were no errors thrown in that step that I could see). I am u…
-
I set up this model and am running it as a server using vLLM.
Here is the command:
`python -m vllm.entrypoints.openai.api_server --model "/root/autodl-tmp/kdy/models/ALMA-13B-R" --served-m…
-
Hey folks!
I'm working on optimizing a deployment of `whisper-large-v3` by moving to `float16`/`bfloat16` instead of the default `float32`. The problem is that the cross-attention ca…
-
I wanted to convert two models for use on Inf1: the MoveNet model and another model that was saved from Keras as a .h5 file. The MoveNet model is a TensorFlow model (if I am not mistaken), saved in a…
-
In tt-metal, all tensors must currently be aligned to a 4D shape.
For example, if the original shape of the tensor is (4, 8, 32), it should be transformed into (1, 4, 8, 32) to be compatible with tt-…
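
The shape adjustment in the example above can be sketched in plain Python. This is only an illustration of the (4, 8, 32) to (1, 4, 8, 32) case: the helper name `to_4d_shape` is hypothetical and not part of tt-metal, and padding with leading 1s for other ranks is an assumption generalized from that single example.

```python
def to_4d_shape(shape):
    """Pad a shape with leading 1s until it has exactly four dimensions.

    Hypothetical helper for illustration; not a tt-metal API.
    """
    shape = tuple(shape)
    if len(shape) > 4:
        # Assumption: shapes with more than four dimensions are out of scope here.
        raise ValueError(f"expected at most 4 dimensions, got {len(shape)}")
    # Prepend as many 1s as needed; the element count is unchanged.
    return (1,) * (4 - len(shape)) + shape


print(to_4d_shape((4, 8, 32)))
```

Because only size-1 dimensions are added, the number of elements stays the same, so the result is a valid target for a reshape of the original tensor.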
-
Once you implement a db that saves the URL and caption, you can easily add a full-text search to render all the images that contain a text input. For simplicity, use SQLite. 🔍
You can go a step f…
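
A minimal sketch of the URL-and-caption store with full-text search, assuming Python's built-in `sqlite3` module and a SQLite build that includes the FTS5 extension (common, but not guaranteed). The table name `images`, the function names, and the sample rows are made up for illustration.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (url, caption) pairs into it."""
    conn = sqlite3.connect(":memory:")
    # The url column is stored but not indexed, so searches only match captions.
    conn.execute("CREATE VIRTUAL TABLE images USING fts5(url UNINDEXED, caption)")
    conn.executemany("INSERT INTO images (url, caption) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search_images(conn, text):
    """Return the URLs whose caption contains the given text as a phrase."""
    # Wrap the input in double quotes so it is treated as a literal phrase
    # rather than as FTS5 query syntax.
    phrase = '"' + text.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT url FROM images WHERE images MATCH ? ORDER BY rank", (phrase,)
    )
    return [row[0] for row in cursor]


conn = build_index([
    ("https://example.com/a.png", "a cat sleeping on a sofa"),
    ("https://example.com/b.png", "a dog running in the park"),
    ("https://example.com/c.png", "a cat and a dog playing"),
])
print(search_images(conn, "cat"))
```

For a persistent store, pass a file path to `sqlite3.connect` instead of `":memory:"`. The returned URLs can then be handed to whatever renders the images.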
-
This happens with the 0.3.0 release but not with 0.2.7. CUDA 12.1 on a V100.
```
➜ ~ k logs -f h2ogpt-vllm-inference-764dfd798c-rlmjd -n h2ogpt
INFO 02-02 01:01:59 api_server.py:209] args: Namespace(host='0.0.0.0',…