-
Multi-image inference for "OpenGVLab/InternVL2-8B" not working
I got this inference code from here https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language_multi_…
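For context, a minimal multi-image sketch along the lines of that upstream example (assumptions: a recent vLLM release with multi-modal support; the `limit_mm_per_prompt` argument, the `<image>` placeholder format, and the image file names are taken from or invented for illustration and may differ by model/version):

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Assumption: a vLLM version with multi-image support. The exact prompt
# placeholder format and the limit_mm_per_prompt argument follow the
# upstream multi-image example and may vary between releases.
llm = LLM(
    model="OpenGVLab/InternVL2-8B",
    trust_remote_code=True,
    limit_mm_per_prompt={"image": 2},  # allow up to two images per prompt
)

# Hypothetical local image files used only for illustration.
images = [Image.open("a.jpg"), Image.open("b.jpg")]
prompt = "<image>\n<image>\nDescribe the differences between the two images."

outputs = llm.generate(
    {
        "prompt": prompt,
        "multi_modal_data": {"image": images},
    },
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

This is a sketch, not a verified reproduction of the linked example; InternVL2 may additionally require its chat template to be applied to the prompt.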
-
### Your current environment
**environment:**
vllm 0.4.2
python3.10
cuda11.8
cpu: 52
mem: 375Gi
**model:**
llama3-70B
### 🐛 Describe the bug
**description**:
vLLM engine init failed, w…
-
### Your current environment
```
Collecting environment information...
PyTorch version: 2.3.0
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 2…
-
Model name: Lenovo Legion Slim 5 16APH8
CPU model: AMD Ryzen 7 7840HS
GPU model: NVIDIA RTX 4060 Mobile
Keyboard backlight: RGB
OS: Archlinux
Output of `sudo dmidecode -t system`. Please remove…
-
### System Info
The 4.38.2 version breaks code using custom 4D attention masks (introduced in #27539). Apparently, the custom masks get replaced here: https://github.com/huggingface/transformers/bl…
-
OS Version:`Ubuntu 22.04 Docker`
Spring Version: `v1.0.1`
Snapshot Version: `v8`
Download Snapshot:
```bash
wget -O /tmp/snapshot-v8-latest.bin https://snapshots.eosnation.io/eos-v8/latest
```…
-
### What happened?
Cannot boot the image. It's not related to NVMe or SD-card boot, but to a missing `.dtb` file.
Process followed: [official](http://www.orangepi.org/orangepiwiki/index.php?title=Orange_Pi_3…
-
### Describe the bug
I recently tried using openllm to connect to llama, and it gave me some bentoml config errors. I'm not sure if it's because I don't have a GPU, but I didn't see any evidence o…
-
Hi, I am wondering about the training process of the small model and its verification accuracy, since it has a large effect on decoding effectiveness. Thank you!
-
Run the code:
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
import torch
model_path2 = "/home/xxx/llm/Qwen1.5-32B-Chat-GPTQ-Int4"
# Initialize the tokenizer
tok…