fyting opened this issue 2 months ago
Yes, you can follow this PR (https://github.com/OpenGVLab/InternVL/pull/506/files#diff-a6d78bf1713c7a9e7c1c701008ac8761ecf7d9d376f56658522ad6a2bda77016). For 6B + 20B training, it reduced training time from 14.5 h to 9.5 h on 64 GPUs with a ViT input of 9 and an LLM input length of 4096. @fyting
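The core idea behind that PR (OmniBal) is to balance the per-rank ViT and LLM workload so that no data-parallel rank straggles behind the others. Below is a minimal illustrative sketch of that idea, not the PR's actual code; the sample fields, cost weighting, and function name are all assumptions:

```python
from typing import Dict, List

def balance_across_ranks(samples: List[Dict], num_ranks: int) -> List[List[Dict]]:
    """Greedily assign samples to data-parallel ranks so each rank gets a
    similar combined ViT + LLM workload. Assumes every sample carries
    precomputed 'num_tiles' (ViT load) and 'num_tokens' (LLM load)."""
    # Heaviest-first ordering makes the greedy assignment balance well.
    order = sorted(samples, key=lambda s: s["num_tiles"] + s["num_tokens"], reverse=True)
    buckets: List[List[Dict]] = [[] for _ in range(num_ranks)]
    loads = [0.0] * num_ranks
    for s in order:
        # Combined cost of one sample; a real implementation would
        # calibrate the relative weight of ViT vs. LLM compute.
        cost = s["num_tiles"] + 0.1 * s["num_tokens"]
        i = loads.index(min(loads))  # least-loaded rank so far
        buckets[i].append(s)
        loads[i] += cost
    return buckets

# Example: spread 8 mixed-size samples across 4 ranks.
samples = [{"num_tiles": t, "num_tokens": n}
           for t, n in [(1, 300), (9, 4000), (4, 1200), (2, 800),
                        (6, 2500), (9, 3800), (3, 600), (5, 1500)]]
for rank, bucket in enumerate(balance_across_ranks(samples, num_ranks=4)):
    print(rank, [(s["num_tiles"], s["num_tokens"]) for s in bucket])
```

Since a synchronous data-parallel step finishes only when the slowest rank does, evening out per-rank cost like this directly shortens step time.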
Thank you for your guidance. Could you please also provide the .sh script used for training?
After setting use_fast_dataset=True in the config, the training process gets stuck at this point. What could be the issue?
Maybe you can try inserting some breakpoints (pdb) to locate where it hangs, @fyting.
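For example, here is a minimal sketch of what that debugging could look like; the builder function below is a hypothetical placeholder, not an actual InternVL path:

```python
# Hypothetical sketch: drop a breakpoint just before the code path that
# runs when use_fast_dataset=True, then step through with 'n'/'s' to see
# where progress stops.

def build_fast_dataset(cfg):       # placeholder for the real builder
    import pdb; pdb.set_trace()    # execution pauses here
    ...

# If the hang is inside a multi-worker DataLoader, pdb may not get a
# usable terminal. A stdlib alternative is to dump all thread stacks
# after a timeout (trying num_workers=0 can also rule out worker
# deadlocks):
import faulthandler
faulthandler.dump_traceback_later(timeout=120, exit=True)
```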
Is the way to use OmniBal in the InternVL codebase to add the use_fast_dataset=True configuration in the bash script? For example, if I add use_fast_dataset=True in this file: https://github.com/ModelTC/InternVL/blob/OmniBal_V2.0/internvl_chat/shell/internvl1.5/hermes2_yi34b/internvl_chat_v1_5_hermes2_yi34b_dynamic_res_finetune.sh, will it accelerate training?