alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud.
Apache License 2.0

llama3.1: support mixed pre-training on multiple datasets #347

Closed Bob199511 closed 2 months ago

Bob199511 commented 2 months ago

```bash
sh run_mcore_llama3_1.sh \
    dsw \
    8B \
    1 \
    8 \
    1e-5 \
    1e-6 \
    128 \
    128 \
    bf16 \
    4 \
    2 \
    1 \
    true \
    true \
    true \
    false \
    false \
    false \
    100000 \
    /mnt/data/baohua.yin/data/llama3_cpt_data/data1,/mnt/data/baohua.yin/data/llama3_cpt_data/data2 \
    /mnt/data/baohua.yin/data/llama3_cpt_data/data1,/mnt/data/baohua.yin/data/llama3_cpt_data/data2 \
    /mnt/data/baohua.yin/temp/llama3-ckpts/mcore-tp4-pp2 \
    100000 \
    100 \
    /mnt/data/baohua.yin/temp/llama3-ckpts/output
```

With the script above, when I try to pass multiple datasets for pre-training, it fails with an error saying the corresponding folder cannot be found. Does the script currently support multi-dataset pre-training?
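For context, a common cause of "folder not found" errors in Megatron-LM-based stacks is that the data argument must point at an indexed-dataset *prefix* (the path before the `.bin`/`.idx` extensions produced by preprocessing), not at a directory, and multiple corpora are blended by listing space-separated weight/prefix pairs. A minimal sketch of how such a blended data-path string is typically assembled; the `_text_document` suffix and the equal 0.5/0.5 weights are assumptions, not taken from this issue:

```shell
# Hypothetical sketch: point each entry at the indexed-dataset prefix
# (the path before .bin/.idx), not at the containing directory.
DATASET_1=/mnt/data/baohua.yin/data/llama3_cpt_data/data1_text_document
DATASET_2=/mnt/data/baohua.yin/data/llama3_cpt_data/data2_text_document

# Megatron-LM-style blending: space-separated "weight prefix" pairs.
# Here both corpora are sampled with equal weight (an assumption).
DATA_PATH="0.5 ${DATASET_1} 0.5 ${DATASET_2}"

echo "--data-path ${DATA_PATH}"
```

Whether the `run_mcore_llama3_1.sh` wrapper expects this weight/prefix form or its own comma-separated convention is version-dependent, so check how the script forwards its dataset argument to the underlying pretrain entry point.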

Bob199511 commented 2 months ago

Resolved.