Changes the default settings to apply standard batching during inference.
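For illustration, here is a minimal sketch of what fixed-size batched inference can look like in JAX. The names `run_batched`, `apply_fn`, and `params` are hypothetical and not part of this change; the padding of the last batch is one common way to avoid recompiling the jitted function for a remainder-sized batch.

```python
import numpy as np
import jax
import jax.numpy as jnp

def run_batched(apply_fn, params, inputs, batch_size=32):
    """Run `apply_fn(params, batch)` over `inputs` in fixed-size batches.

    The final batch is zero-padded to `batch_size` so XLA compiles a
    single program instead of one per remainder size.
    """
    forward = jax.jit(apply_fn)
    outputs = []
    for start in range(0, inputs.shape[0], batch_size):
        batch = np.asarray(inputs[start:start + batch_size])
        pad = batch_size - batch.shape[0]
        if pad:  # zero-pad the last, smaller batch
            batch = np.concatenate(
                [batch, np.zeros((pad,) + batch.shape[1:], batch.dtype)]
            )
        out = forward(params, jnp.asarray(batch))
        outputs.append(np.asarray(out)[:batch_size - pad])
    return np.concatenate(outputs)
```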
Limitations:
I ran into a problem at the end while re-loading the sharded Flax model; it works fine if the model is small enough to be saved as a single file (a single-file serialization sketch follows below). The predictions are saved in Flax format regardless.
This still seems to take a large amount of CPU memory; in particular, there was a spike at the end of the runtime (presumably once inference is done and the model is being saved).
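For the single-file case mentioned above, a minimal sketch of round-tripping the full parameter pytree through one msgpack file with `flax.serialization.to_bytes` / `from_bytes`. The helper names and paths are hypothetical; this only applies when the model fits in a single file, not to sharded checkpoints.

```python
import flax.serialization

def save_params(params, path):
    # Serialize the whole parameter pytree into one msgpack file.
    with open(path, "wb") as f:
        f.write(flax.serialization.to_bytes(params))

def load_params(template_params, path):
    # `template_params` supplies the pytree structure to restore into,
    # e.g. a freshly initialized model's params.
    with open(path, "rb") as f:
        return flax.serialization.from_bytes(template_params, f.read())
```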