-
### System Info
torch 2.0.1
torchaudio 2.0.2
torchvision 0.15.2
### Information
- [ ] The official example scripts
- [ ] My own…
-
When running `pretrain.py` with 1 or 4 GPUs and the DDPStrategy as described in the docs, I get the following error:
```bash
"PyTorch/1.12.0-foss-2022a-CUDA-11.7.0/lib/python3.10/.../torch/distribu…
-
HOST installation steps
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index…
-
### Your current environment
```text
def destroy(self):
    import gc
    import torch
    import ray
    import contextlib
    logger.info("vllm destroy")
    def …
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) didn't find any similar reports.
…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Amazon Linux 2 (x86_64)
GCC vers…
-
### What happened + What you expected to happen
When vLLM distributes to Ray the work of running an api_server across 2 machines, it is expected that Ray will execute the work successfully and ru…
-
I ran into a series of issues trying to get vLLM stood up on a system with multiple MI210s. I figured I'd document my issues and workarounds so that someone could pick up the baton later, or at least …
-
### Your current environment
```text
The output of `python collect_env.py`
```
### 🐛 Describe the bug
ERROR 07-01 08:12:10 async_llm_engine.py:52] Engine background task failed
ERROR 07-01 08:…
-
Hello,
I am looking at Lance for a PyTorch dataloader. I am having issues with a Lance-based loader (like this one: https://lancedb.github.io/lance/examples/llm_training.html) when using it in a di…