-
### Your current environment
The output of `python collect_env.py`
```text
vLLM version: 0.5.5
NCCL: 2.20.5
GPU: Tesla V100-SXM2-32GB
CUDA version: 12.6
Driver version: 560.28.03
`…
-
### 🚀 The feature, motivation and pitch
**Overview**
The goal of this RFC is to discuss the integration of distributed inference into TorchChat. Distributed inference leverages tensor parallelism …
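To make the tensor-parallel idea concrete, here is a minimal sketch (a NumPy stand-in, not TorchChat's actual implementation) of column-splitting a linear layer's weight across two workers and gathering the partial outputs:

```python
import numpy as np

# Tensor parallelism in miniature: the weight matrix of a linear layer is
# split column-wise across "devices"; each device computes a partial output
# from the same input, and concatenating the shards gives the full result.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of activations
W = rng.standard_normal((8, 16))   # full weight matrix

# Shard the weight across two workers (column parallelism).
W0, W1 = np.split(W, 2, axis=1)

# Each worker multiplies the same input by its own shard.
y0 = x @ W0
y1 = x @ W1

# Gather: the concatenated shards match the unsharded computation.
y = np.concatenate([y0, y1], axis=1)
assert np.allclose(y, x @ W)
```

In a real deployment each shard lives on a separate GPU and the gather happens over NCCL, but the arithmetic is exactly this.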
-
I wonder if I might be doing something wrong, but it appears that all of my model evaluations are running serially when I believe they should be running in parallel.
If I define my `allocation…
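For comparison, a minimal sketch of dispatching independent evaluations in parallel rather than serially, using Python's `concurrent.futures` (the `evaluate` function and its inputs are hypothetical stand-ins, not the reporter's actual code):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(config):
    # Hypothetical stand-in for a single model evaluation.
    return config * config

configs = [1, 2, 3, 4]

# map fans the calls out across the pool and returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate, configs))
# results == [1, 4, 9, 16]
```

For CPU-bound evaluations, `ProcessPoolExecutor` has the same interface and sidesteps the GIL.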
-
### Your current environment
Name: vllm
Version: 0.6.3.post2.dev171+g890ca360
### Model Input Dumps
_No response_
### 🐛 Describe the bug
I used the interface from this vllm repository …
-
The Bedrock SDK client cannot accept the `tool_choice` param for `disable_parallel_tool_use`. I am on the latest Anthropic SDK 0.32.1 and Bedrock package 0.11.2.
### Current Behavior:
Supplying the field retur…
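For context, a sketch of the request shape involved (the tool definition is a made-up example; `disable_parallel_tool_use` is the field documented in the Anthropic Messages API under `tool_choice`):

```python
# Request payload exercising the failing field. Per the Anthropic Messages
# API, disable_parallel_tool_use is nested inside tool_choice; passed as
# client.messages.create(**request) on an AnthropicBedrock client, this is
# the call shape that fails.
request = {
    "model": "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "max_tokens": 256,
    "tools": [{
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "tool_choice": {"type": "auto", "disable_parallel_tool_use": True},
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
}
```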
-
## Background
*relevant information and motivation for this task*
See #159.
See https://aaltoscicomp.github.io/python-for-scicomp/parallel/ and let us know if it's useful.
## Task
Compute…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…
-
### System Info
Sagemaker Docker images:
```shell
763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-training-neuronx:1.13.1-transformers4.36.2-neuronx-py310-sdk2.18.0-ubuntu20.04
…
-
### Model Series
Qwen2.5
### What are the models used?
Qwen2.5-72B-Instruct
### What is the scenario where the problem happened?
vllm
### Is this a known issue?
- [X] I have fo…
-
Hi akash-aky, first of all, thank you for creating `Exile`; it's an amazing library! I recently ran into some problems using it to execute this application in `parallel`. Here are my debug results:
`…