In the InternLM-XComposer2-4KHD paper, I found this description: "we adjust the data loader to enable different batch sizes for them and adjust their weight accordingly."
How can this be achieved during training? Looking forward to your reply. Thanks!
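
To make the question concrete, here is a minimal sketch of one possible reading of that sentence. It is not the authors' implementation. It assumes that "weight" means the probability of picking a data source for the next batch (it could equally mean a loss weight), and that each batch is drawn from a single source. The class name `MultiSourceBatchSampler` and all of its parameters are hypothetical.

```python
import random


class MultiSourceBatchSampler:
    """Hypothetical sampler: each batch comes from one source, with that source's batch size.

    Yields lists of global indices into the concatenation of all sources.
    """

    def __init__(self, sizes, batch_sizes, weights, num_batches, seed=0):
        assert len(sizes) == len(batch_sizes) == len(weights)
        self.sizes = sizes              # number of samples in each source
        self.batch_sizes = batch_sizes  # per-source batch size
        self.weights = weights          # assumed meaning: per-source sampling weight
        self.num_batches = num_batches  # batches per pass over the sampler
        self.seed = seed
        # offsets[i] is the global index of the first sample of source i
        self.offsets = [sum(sizes[:i]) for i in range(len(sizes))]

    def __len__(self):
        return self.num_batches

    def __iter__(self):
        # Fixed seed: every pass yields the same batches; vary the seed per epoch if needed.
        rng = random.Random(self.seed)
        for _ in range(self.num_batches):
            # Pick a source in proportion to its weight.
            src = rng.choices(range(len(self.sizes)), weights=self.weights, k=1)[0]
            # Draw that source's batch size worth of samples, without replacement.
            k = min(self.batch_sizes[src], self.sizes[src])
            local = rng.sample(range(self.sizes[src]), k)
            yield [self.offsets[src] + j for j in local]


# Example: a large-image source with small batches, a small-image source with large batches.
sampler = MultiSourceBatchSampler(
    sizes=[1000, 200], batch_sizes=[4, 16], weights=[0.7, 0.3], num_batches=50
)
batches = list(sampler)
```

An object like this could be passed as `batch_sampler=` to `torch.utils.data.DataLoader` over a `torch.utils.data.ConcatDataset` built from the sources in the same order. Is this roughly what the paper does, or are the weights applied to the loss instead?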