Remove Apex dependency

huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Apache License 2.0

1.23k stars 122 forks

Closed. NouamaneTazi closed this 10 months ago

NouamaneTazi commented 10 months ago

Tested that this works by running:

USE_FAST=1 CUDA_DEVICE_MAX_CONNECTIONS=1 torchrun --rdzv-backend=c10d --nproc_per_node=8 run_train.py --config-file examples/config_tiny_llama.yaml
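For anyone who wants to re-run the same check from a script, here is a minimal sketch that assembles the command above in Python. The helper names `build_launch` and `launch` are hypothetical and are not part of nanotron. The environment variables, flags, and config path are copied verbatim from the command. It assumes `torchrun` is on the PATH and that it is run from the repository root.

```python
import os
import subprocess


def build_launch(nproc_per_node=8, config_file="examples/config_tiny_llama.yaml"):
    """Return (argv, env) mirroring the command quoted in this thread.

    Hypothetical helper for illustration only; not part of nanotron.
    """
    # Copy the current environment and add the two variables from the command.
    env = dict(os.environ)
    env["USE_FAST"] = "1"
    env["CUDA_DEVICE_MAX_CONNECTIONS"] = "1"

    # Same flags, in the same order, as the shell command.
    argv = [
        "torchrun",
        "--rdzv-backend=c10d",
        f"--nproc_per_node={nproc_per_node}",
        "run_train.py",
        "--config-file",
        config_file,
    ]
    return argv, env


def launch():
    """Start the training run; raises CalledProcessError on a non-zero exit."""
    argv, env = build_launch()
    subprocess.run(argv, env=env, check=True)
```

Calling `launch()` starts the run. `build_launch()` alone only returns the argument list and environment, so the command can be inspected without starting training.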

3outeille commented 10 months ago

Looks good to me.