mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training

[unet3D]: Rounding up epochs with mixed-batches and learning_rate schedules #427

Open mndevec opened 3 years ago

mndevec commented 3 years ago

There are 168 images in the Unet3D dataset, which is fewer than in the other benchmarks. Based on the rules here, if we were to use a batch size of 128, we could use the mixed-batch approach and merge images from 2 epochs into a single step.

This makes it a bit complicated to satisfy math equivalence for learning_rate schedules. The closest mathematically equivalent approach would be to scale the step numbers by (256 / 168) to match the reference epoch number. As in the other models, this still applies different learning_rates to partial epochs, but the difference might be more visible here due to the smaller dataset size.
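
As a rough illustration of the scaling described above, here is a minimal sketch (not the reference implementation or any MLPerf-mandated code) of how a step index could be converted to a fractional epoch and fed into an epoch-based learning_rate schedule, assuming every mixed-batch step consumes exactly the global batch size. The schedule shape and all constants other than the dataset size and the batch size from the question are hypothetical placeholders.

```python
DATASET_SIZE = 168        # Unet3D training images
GLOBAL_BATCH_SIZE = 128   # example batch size from the question

def step_to_epoch(step: int) -> float:
    """Fractional epochs completed after `step` mixed-batch steps."""
    return step * GLOBAL_BATCH_SIZE / DATASET_SIZE

def lr_at_epoch(epoch: float, base_lr: float = 1.0,
                warmup_epochs: float = 10.0,
                total_epochs: float = 100.0) -> float:
    """Placeholder epoch-based schedule: linear warmup, then linear decay."""
    if epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs
    return base_lr * max(0.0, 1.0 - (epoch - warmup_epochs)
                         / (total_epochs - warmup_epochs))

# After 2 steps, 256 images have been consumed, i.e. 256/168 ~ 1.52 epochs,
# so the schedule is evaluated at a partial epoch rather than an integer one.
print(step_to_epoch(2), lr_at_epoch(step_to_epoch(2)))
```

Under this view, the (256 / 168) factor is just the fractional epoch reached after two 128-image steps, which is where the partial-epoch learning_rate applications come from.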

So I wanted to be sure that this still satisfies math equivalence under the current rules. Is that right?

mndevec commented 3 years ago

@johntran-nv @sergey-serebryakov Would it be possible to include this in the next working group meeting?