Hi,
Great work with the Kandinsky model, the last improvements look really impressive 🎨
For prior training/tuning, I saw that the default batch size is 1. Is that actually the size used during training, or is a larger batch needed for stable training?
Would it be possible to share the configuration used for training the prior from scratch (the one that took 1M iterations)?