-
Does this support distributed training (e.g., DDP/FSDP)? Thanks for sharing!
-
@WongKinYiu mentioned that help is requested in building out DDP-compatible training infrastructure. Let's start that discussion here.
-
Hi!
As recently discussed in #145 and #144 with @Xiaoming-Zhao (and as I had already mentioned in https://github.com/atong01/conditional-flow-matching/pull/116#discussion_r1695722539), I/we believe…
-
### Search before asking
- [X] I have searched the YOLOv5 [issues](https://github.com/ultralytics/yolov5/issues) and [discussions](https://github.com/ultralytics/yolov5/discussions) and found no simi…
-
Hi, I am trying to fine-tune HMR2.0 on 3 RTX 3090 GPUs. It works fine when I train on only 1 GPU.
When I set trainer.devices=3, I received the error:
ValueError: ctypes objects containing pointers can…
-
### System Info
```shell
AWS EC2 instance: trn1.32xlarge
OS: Ubuntu 22.04.4 LTS
Platform:
- Platform: Linux-6.5.0-1023-aws-x86_64-with-glibc2.35
- Python version: 3.10.12
Python packages:
…
```
-
Hey,
Are DDP training and nn.SyncBatchNorm.convert_sync_batchnorm supported?
-
Hi,
Does it support DDP training?
-
Hi community,
I have been stuck on this issue for some time now and would greatly appreciate any help! I am trying to run the optimise_hyperparameter function over 2 A100 GPUs using PyTorch DDP strat…
-
Investigate ways to bring GPU utilization as close to 100% as possible and maximize model throughput. Focus on multi-GPU on a single node.
Collecting some questions from me and @benczaja -- fee…