-
### Your current environment
```text
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu …
```
-
### Your current environment
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (U…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: …
```
-
### Your current environment
2024-04-24 06:04:07 (27.2 MB/s) - ‘collect_env.py’ saved [24877/24877]
Collecting environment information...
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA…
-
### Your current environment
NUMA node(s): 2
NUMA node0 CPU(s): 0-19,40-59
NUMA node1 CPU(s): 20-39,60-79
Vulnerability Gather data sampling…
-
### Proposal to improve performance
I have observed that TTFT (time to first token) increases linearly with the total number of batched tokens.
For example, given a 100k batch:
- TTFT is around 2min when an average prompt…
-
### Your current environment
```text
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu 24.04 LTS (x86_64)
GCC version: (Ubunt…
```
-
### Your current environment
```
root@cy-ah85026:/vllm-workspace# ray status
======== Autoscaler status: 2024-08-02 02:04:32.248220 ========
Node status
------------------------------------------…
```
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```