Closed wuxibin89 closed 3 months ago
len(prompts_dataloader) == prompts // micro_rollout_batch_size
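When micro_rollout_batch_size and micro_train_batch_size are not equal, num_update_steps_per_episodes is computed incorrectly, because it is derived from the rollout dataloader length rather than from the number of training batches. As a result, the cosine learning rate scheduler does not work as expected.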
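To make the mismatch concrete, here is a minimal sketch in plain Python. It does not reproduce the project's actual code: the helper names `num_micro_batches` and `cosine_lr`, the example sizes, and the assumption of one training sample per prompt are all hypothetical, chosen only to illustrate how a step count taken from the rollout dataloader diverges from the number of training steps.

```python
import math


def num_micro_batches(num_samples: int, micro_batch_size: int) -> int:
    """Number of full micro-batches, mirroring len(dataloader) with floor division."""
    return num_samples // micro_batch_size


def cosine_lr(step: int, total_steps: int, base_lr: float = 1.0) -> float:
    """Plain cosine decay from base_lr to 0 over total_steps (no warmup, clamped at the end)."""
    progress = min(step / total_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical sizes, assuming one training sample per prompt.
num_prompts = 1024
micro_rollout_batch_size = 8
micro_train_batch_size = 2

# Step count derived from the rollout dataloader length (the quantity in the issue).
steps_from_rollout = num_micro_batches(num_prompts, micro_rollout_batch_size)

# Step count the training loop would actually take with the training micro-batch size.
steps_from_train = num_micro_batches(num_prompts, micro_train_batch_size)

# Learning rate after steps_from_rollout training steps under each total.
lr_with_rollout_total = cosine_lr(steps_from_rollout, steps_from_rollout)
lr_with_train_total = cosine_lr(steps_from_rollout, steps_from_train)

print(steps_from_rollout, steps_from_train)
print(lr_with_rollout_total, lr_with_train_total)
```

With these numbers the scheduler is told to finish after 128 steps while training runs for 512, so the learning rate has already decayed to zero a quarter of the way through. If the two micro batch sizes were reversed, the schedule would instead be cut off before it finished decaying.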