-
I'm running into cases where I have to allgather a few extra bytes (say 4 bytes), which makes the data not perfectly 32-byte or 64-byte aligned. While doing this, substantial performance degradation was observe…
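For reference, this kind of size can be exercised directly with nccl-tests; below is a minimal sketch, assuming nccl-tests is built in `./build` and 8 local GPUs are visible (the sizes and the 4-byte offset are illustrative, not taken from the report):

```shell
# Baseline: a 64-byte-aligned allgather size
./build/all_gather_perf -b 1M -e 1M -g 8

# Same size plus 4 extra bytes, so the buffer is no longer a multiple of 32 or 64 bytes
./build/all_gather_perf -b 1048580 -e 1048580 -g 8
```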
-
I was debugging the following issue in PyTorch with regard to NCCL send/recv: https://github.com/pytorch/pytorch/issues/50092. I tried to see if I could somehow reproduce the issue in NCCL itself to …
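For anyone attempting something similar, point-to-point traffic can also be driven outside PyTorch with the sendrecv test from nccl-tests; a minimal sketch, assuming nccl-tests is built in `./build` and two GPUs are visible (sizes are illustrative):

```shell
# Plain NCCL send/recv sweep between two local GPUs, independent of PyTorch
NCCL_DEBUG=INFO ./build/sendrecv_perf -b 8 -e 128M -f 2 -g 2
```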
-
Motivation for building functionality similar to NCCLX, as described in sections 3.3.3 Collective Communication and 3.3.4 Reliability and Operational Challenges of the [Llama-3 paper](https://arxiv.org/abs/2407.21783).
-
### Describe the bug
This time I set the number of steps to 2 to make sure it correctly saves the model after an hour of training, but it does not.
### Reproduction
Run `accelerate config`
```
comp…
```
-
We hit a strange issue when running benchmark tests on A100 GPUs. The command is as follows:
mpirun -np 16 -H rdma1:8,rdma2:8 --allow-run-as-root -bind-to none -map-by slot -x NCCL_DEBUG=INFO -x NCCL_…
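The command above is truncated in the preview; for comparison, a typical full two-node nccl-tests invocation looks roughly like the sketch below (the exported variables after NCCL_DEBUG and the choice of all_reduce_perf are assumptions, not taken from the report):

```shell
mpirun -np 16 -H rdma1:8,rdma2:8 --allow-run-as-root \
  -bind-to none -map-by slot \
  -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
  ./build/all_reduce_perf -b 8 -e 8G -f 2 -g 1
```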
-
I am trying to train models on multiple nodes with SLURM as the workload manager. The issue seems to be that the Python virtual environment is not available on all nodes. Please find more details below. …
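A common workaround is to place the virtual environment on a filesystem mounted by every node and activate it inside the batch script; a minimal sketch, assuming a shared path such as `/shared/venv` and a hypothetical `train.py`:

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-node=8

# Activate the environment from shared storage on every node, then launch training
srun bash -c 'source /shared/venv/bin/activate && python train.py'
```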
-
Hi,
I use Arch Linux with dual GPUs connected with NVLink. I installed `cuda` and `nccl` from the community repo.
```
cuda 11.8.0-1
nccl 2.15.5-1
```
I use the following command:
`CUDA_…
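One way to sanity-check NCCL over NVLink on a dual-GPU box is with nccl-tests; a minimal sketch, assuming nccl-tests is built in `./build` (the sizes are illustrative, not from the original report):

```shell
# Confirm the NVLink connection between the two GPUs
nvidia-smi topo -m

# Run an all-reduce sweep across both GPUs with debug logging
CUDA_VISIBLE_DEVICES=0,1 NCCL_DEBUG=INFO \
  ./build/all_reduce_perf -b 8 -e 256M -f 2 -g 2
```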
-
Hello,
After a few experiments, it seems that NCCL uses a double-ring topology for data transfer. Is double ring the default? Or is there an option to change to a single-ring topology? I am investigatin…
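For what it's worth, the number of rings/channels can be inspected and constrained through NCCL's documented environment variables; a minimal sketch using nccl-tests (the binary and sizes are illustrative):

```shell
# Inspect how many channels NCCL builds (look for the "Channel 00/..." lines)
NCCL_DEBUG=INFO ./build/all_reduce_perf -b 128M -e 128M -g 2

# Constrain NCCL to a single channel and compare bandwidth
NCCL_DEBUG=INFO NCCL_MIN_NCHANNELS=1 NCCL_MAX_NCHANNELS=1 \
  ./build/all_reduce_perf -b 128M -e 128M -g 2
```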
-
Hi, developers:
I run nccl-tests (all_reduce_perf) with 8 GPUs (NVIDIA A30) and 2 NICs (100G) between two identical GPU servers (PCIe 4.0). The topology of each GPU server is as follows:
```shell
GPU0 GPU…
```
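With two NICs per server, it can also help to make the NIC selection explicit and check which interfaces NCCL picks; a minimal sketch, with placeholder hostnames and HCA names:

```shell
# Show how GPUs and NICs attach to the PCIe/NUMA topology
nvidia-smi topo -m

# Pin NCCL to specific HCAs and log which NIC each channel uses
mpirun -np 16 -H host1:8,host2:8 --allow-run-as-root \
  -x NCCL_DEBUG=INFO -x NCCL_IB_HCA=mlx5_0,mlx5_1 \
  ./build/all_reduce_perf -b 8 -e 4G -f 2 -g 1
```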
-
File "./tools/train_net_multi_gpu.py", line 109, in
max_iter=args.max_iters, gpus=gpus)
File "/home/jzheng/PycharmProjects/bottom-up-attention/tools/../lib/fast_rcnn/train_multi_gpu.py", li…