InternLM / InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
https://internevo.readthedocs.io/zh-cn/latest/?badge=latest
Apache License 2.0
310 stars 52 forks

[QA] Does LoongTrain support packed_sample_into_one=false? #346

Open Lzhang-hub opened 1 month ago

Lzhang-hub commented 1 month ago

Describe the question

A quick question: does long-context training support isolating samples from one another?

mwiacx commented 2 weeks ago

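Yes, it is supported. InternEvo's default configs almost all set use_packed_data = True and pack_sample_into_one = False. For LoongTrain, however, we currently only support unpacked data, because the zigzag attention that 2D attention depends on has so far only been adapted for the unpacked version.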
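
For readers unfamiliar with the term, here is a minimal sketch of what sample isolation means for packed data. It is not InternEvo code; the function names (`cu_seqlens_from_lengths`, `isolated_causal_mask`, `merged_causal_mask`) are hypothetical and chosen only for illustration. It assumes that pack_sample_into_one = False corresponds to the isolated case, where a token attends only to earlier tokens of its own sample.

```python
from itertools import accumulate


def cu_seqlens_from_lengths(lengths):
    """Cumulative sample boundaries for samples packed into one sequence."""
    return [0] + list(accumulate(lengths))


def isolated_causal_mask(lengths):
    """Causal mask with sample isolation.

    Token i may attend to token j only if j <= i and both tokens
    belong to the same packed sample (a block-diagonal causal mask).
    """
    cu = cu_seqlens_from_lengths(lengths)
    total = cu[-1]
    # Map every token position to the index of the sample it came from.
    sample_id = [0] * total
    for idx, (start, end) in enumerate(zip(cu[:-1], cu[1:])):
        for pos in range(start, end):
            sample_id[pos] = idx
    return [
        [sample_id[i] == sample_id[j] and j <= i for j in range(total)]
        for i in range(total)
    ]


def merged_causal_mask(lengths):
    """Plain causal mask: the whole pack is treated as a single sample."""
    total = sum(lengths)
    return [[j <= i for j in range(total)] for i in range(total)]


if __name__ == "__main__":
    # Two samples of length 2 and 3 packed into one sequence of length 5.
    for row in isolated_causal_mask([2, 3]):
        print("".join("x" if allowed else "." for allowed in row))
```

In practice, fused attention kernels usually take the cumulative boundaries directly instead of a dense mask. The dense form is used here only because it makes the isolation visible.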