Closed: XilunWu closed this 1 month ago
Stack from ghstack (oldest at bottom):
Test Plan
unit test:
torchrun --nproc_per_node=4 --rdzv_backend c10d --rdzv_endpoint="localhost:0" test_fused_rms_norm.py
llama training test:
CONFIG_FILE=./train_configs/debug_model.toml NGPU=4 LOG_RANK=0,1,2,3 ./run_llama_train.sh