OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Apache License 2.0

[Feature Request]: MiniCPM's training batch_size is 4M — how should this be interpreted? #50

Closed: sunshineflg closed this issue 6 months ago

sunshineflg commented 6 months ago

Feature request

The batch size used during model training is 4M. Does 4M mean 4 million? Does one batch use 4 million examples?

THUCSTHanxu13 commented 6 months ago

The batch here is counted in tokens, not in the number of examples.
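A minimal sketch of what a token-counted batch size means in practice: the 4M figure from the thread is a token budget, so the number of packed sequences per batch depends on the sequence length. The sequence length of 4096 below is a hypothetical value for illustration, not MiniCPM's actual configuration.

```python
# Convert a token-counted batch size into a number of packed sequences.
# The 4M figure comes from the thread; seq_len = 4096 is an assumption.
def sequences_per_batch(batch_tokens: int, seq_len: int) -> int:
    """Number of fixed-length sequences that fit in one token-counted batch."""
    return batch_tokens // seq_len

batch_tokens = 4 * 1024 * 1024  # "4M" counted in tokens, not examples
seq_len = 4096                  # assumed context length per sample

print(sequences_per_batch(batch_tokens, seq_len))  # 1024 sequences per batch
```

Under these assumptions, one 4M-token batch corresponds to 1024 sequences of 4096 tokens each, not 4 million examples.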

zizhec commented 6 months ago

Then how many tokens is each sample?

nmd2k commented 6 months ago

Thanks for clarifying 🤣