ppbrown opened 6 months ago
Have you tried COSINE for OneTrainer? The paper also says to use cosine annealing, and OneTrainer's COSINE is equivalent (reference: https://github.com/Nerogar/OneTrainer/issues/214)
Also try increasing your d_coefficient to 2.0 or 3.0, which will let it jump to higher LRs as it searches for the right LR. Watch TensorBoard to see where it's jumping to.
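To illustrate why raising d_coefficient lets the optimizer reach higher LRs: Prodigy keeps a running distance estimate `d` that only ever grows, and the coefficient scales each new candidate estimate before the max() is taken. This is a simplified toy sketch of that one step, not OneTrainer's or prodigyopt's actual code; the names `d` and `d_hat_estimate` are illustrative:

```python
def next_d(d, d_hat_estimate, d_coef=1.0):
    """Prodigy-style distance update (simplified sketch).

    d              -- current distance estimate (drives the effective LR)
    d_hat_estimate -- fresh estimate computed from gradient statistics
    d_coef         -- multiplier on the candidate; >1 lets d (and so the
                      effective LR) jump higher during the search
    """
    # d is monotone non-decreasing: it only moves up when the scaled
    # candidate exceeds the current value.
    return max(d, d_coef * d_hat_estimate)
```

With d_coef=1.0 a candidate of 0.4 can't lift d past 1.0, but with d_coef=3.0 the same candidate becomes 1.2 and d jumps, which is the "searching at higher LRs" you'd see on the TensorBoard LR curve.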
I'm confused. I thought the "COSINE" stuff actually reset the LR and made for MORE variance, which intuitively makes me think it's for the opposite?
"Cosine Annealing is a type of learning rate schedule that has the effect of starting with a large learning rate that is relatively rapidly decreased to a minimum value before being increased rapidly again"
Disclaimer: very new at all this. I'm coming from the perspective of an inference user: on a good model, with a nice prompt and sampler, there tends to be a number of steps past which it will always converge to a nice stable image for a given seed.
I think I've read that there is some similar effect in model training (SDXL specifically), where if you are doing things right, you will get "convergence" in the resulting model. I take that to mean I would see something similar in the per-epoch sample images: they would converge to something decent over the course of the epochs.
But... I'm not seeing that happen. For example, if I run a 100-epoch training over 80 images, I see something reasonable congeal around maybe epoch 25... then it gets mushy for a while... and then things come back into focus around epoch 80.
I tried turning on "safeguard_warmup" and "bias correction", but the overall effect of those combined seemed to be to just stretch out the training cycle. Now the things that happened at epoch 80 happen at 2x80 = 160 epochs (literally almost the same images).
Are my expectations off? Is it reasonable to believe that there ARE settings that will not only have a convergence, but will converge on something sane looking, given a good input dataset?
I'm using OneTrainer with the following settings at present: