NouamaneTazi closed 10 months ago
Tested that this works by running:
USE_FAST=1 CUDA_DEVICE_MAX_CONNECTIONS=1 torchrun --rdzv-backend=c10d --nproc_per_node=8 run_train.py --config-file examples/config_tiny_llama.yaml
Looks good to me.