-
The script to reproduce the bug.
```python
import os
import time
import pickle
import torch
import threading
import torch.distributed as dist
import torch.distributed.distributed_c10d as c10…
-
### 🐛 Describe the bug
I found a case where specifying a CUDA stream using a `with` context does not work as expected.
```python
with torch.cuda.stream(my_stream):
    dist.all_reduc…
-
**Please describe the bug**
**Please describe the expected behavior**
**System information and environment**
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04, docker): Linux Ubuntu 18.04…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTor…
-
While trying to build with PyTorch, I am getting a CMake error.
`CMake Error at /opt/conda/envs/fastertransformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.26/Modules/FindPackageHandleSt…
-
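I would like to know roughly how much GPU memory this fine-tuning needs. I used six 4090 GPUs but still ran out of GPU memory. Could you help take a look at the problem? My run script looks like this: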
data_path='./data_files'
model_name_or_path='/data2/hugo/lin_rany/model/Meta-Llama-3-8B-Instruct'
export NCCL_P2P_DISABLE=1
export NCCL_IB_…
-
I am following the [getting started](https://epfllm.github.io/Megatron-LLM/guide/getting_started.html) guide with the Mistral-7B model.
- I am able to (1) convert `mistralai/Mistral-7B-v0.1` and (2) …
-
![image](https://github.com/NVIDIA/nccl-tests/assets/79137028/46df9e5a-fc8c-4a7e-9dee-425de5b60165)
When I run nccl-tests with SHARP, I hit the error shown above. What causes this?
I tested using the NGC 24.05 ve…
-
**Some software versions:**
nccl-tests: 2.13.9
openmpi: 4.1.5
rdma ofed: 23.10-1.1.9.0
nvidia-driver: 535.104.12-1
cuda: 11.4.4-1
nccl: 2.21.5-1
**Command**
mpirun --allow-run-as-root -…
-
- My transformers version is 4.21.2 and my deepspeed version is 0.8.1. I run the code [bloom-ds-zero-inference.py](https://github.com/huggingface/transformers-bloom-inference/blob/main/bloom-inference-scripts/blo…