-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
### Your current environment
```
(vllm-gptq) root@k8s-master01:/workspace/home/lich/QuIP-for-all# pip3 list | grep aphrodite
aphrodite-engine 0.5.3 /workspace/home/lich/aphrodite-eng…
-
### 🚀 The feature, motivation and pitch
Last week AMD announced ROCm 6.2 (https://rocm.docs.amd.com/en/latest/about/release-notes.html), which also announces expanded support for vLLM and FP8.
Actuall…
-
Why does a single inference take nearly 1 minute on 8×80G A100?
```python
import os
from vllm import LLM, SamplingParams
import torch
from constants_prompt import build_autoj_input
from zh_constants_prompt import zh_build_autoj_input
impo…
-
a series of text-conditioned Diffusion Transformers (DiT) capable of transforming textual descriptions into vivid images, dynamic videos, detailed multi-view 3D images, and synthesized speech.
Code…
-
When I run the Llama 3.1 example with RunPod I'm getting this error:
h/sky-key' root@69.30.85.136 -p 22035 -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ConnectTimeout=10s -o UserKn…
-
### 🚀 The feature, motivation and pitch
# Background
Currently, the project supports various hardware accelerators such as GPUs, but there is no support for NPUs. Adding NPU support could signific…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
The code I'm running:
```python
lm = dspy.HFClientVLLM(model="NurtureAI/Meta-Llama-3-8B-Instruct-32k", port=38242, url="http://localhost", max_tokens=4)
test_text = "This is a test article. abc"
output_n…
-
### Your current environment
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (U…