Checklist
[X] 1. I have searched related issues but cannot get the expected help.
[X] 2. The bug has not been fixed in the latest version.
[X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
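Training on 20 million samples works fine, but training on 50 million samples runs out of host memory (RAM). The error still occurs with dataloader_num_workers set to 1. The single machine currently has 1600 GB of RAM. Is there a good way to solve this? InternVL has presumably been trained on more data than this, right?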
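For a rough sense of scale, here is a back-of-envelope sketch of how much host RAM is available per sample. It rests on assumptions not confirmed anywhere in this report: that the node runs 8 training processes (`NUM_RANKS` is a guess), that every process and every dataloader worker keeps its own full copy of the dataset metadata, and that "1600G" means decimal gigabytes. The function name is purely illustrative.

```python
def per_sample_budget_bytes(total_ram_bytes, num_samples, num_copies):
    """Host RAM available per sample if every process holds a full copy."""
    return total_ram_bytes / (num_samples * num_copies)


TOTAL_RAM = 1600 * 10**9  # 1600 GB from the report (decimal GB assumed)
NUM_RANKS = 8             # assumption: 8 training processes on the node
NUM_WORKERS = 1           # dataloader_num_workers from the report

# Assumption: each rank and each of its workers keeps its own metadata copy.
copies = NUM_RANKS * (1 + NUM_WORKERS)

for n in (20_000_000, 50_000_000):
    budget = per_sample_budget_bytes(TOTAL_RAM, n, copies)
    print(f"{n:>11,} samples: {budget:,.0f} bytes per sample per copy")
```

Under these assumptions the budget drops from 5000 bytes per sample at 20 million samples to 2000 bytes at 50 million, before counting model, optimizer, or image buffers. If the real per-sample metadata footprint falls between those two figures, that would be consistent with the behavior described, though this is only a guess without a traceback.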
Reproduction
sh 8b_full.sh
Environment
Error traceback
No response