eps696 / aphantasia

CLIP + FFT/DWT/RGB = text to image/video
MIT License

[Feature] Learning Rate Modified by Steps #2

Closed: torridgristle closed this 3 years ago

torridgristle commented 3 years ago

I've experimented with a learning rate that changes as the steps increase, after seeing Aphantasia develop a general image very quickly but then slow down while it works on small details. I believe that my proposed alternative puts more focus on larger shapes, and less on details.

I expose the learning_rate variable and add a learning_rate_max variable in the Generate cell, remove the optimizer = torch.optim.Adam(params, learning_rate) line, and instead add this to def train(i):

learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)  # linear ramp from learning_rate to learning_rate_max
optimizer_new = torch.optim.Adam(params, learning_rate_new)  # note: a fresh optimizer each step also resets Adam's moment estimates

With this, I find that a learning_rate of 0.0001 and a learning_rate_max of 0.008 works well, at least for 300-400 steps and about 50 samples.
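
For reference, here's how this schedule fits a minimal self-contained loop (the toy parameter and placeholder loss below are stand-ins for the notebook's actual FFT params and CLIP loss):

import torch

# toy parameter as a stand-in for the notebook's FFT/DWT image params (assumption)
params = [torch.randn(1, 3, 64, 64, requires_grad=True)]
steps = 300
learning_rate, learning_rate_max = 0.0001, 0.008

for i in range(steps):
    # linear ramp from learning_rate up to learning_rate_max over the run
    learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
    optimizer = torch.optim.Adam(params, learning_rate_new)  # recreated each step, as proposed

    loss = params[0].square().mean()  # placeholder; the notebook minimizes a CLIP-based loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()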

eps696 commented 3 years ago

thanks for the proposal, will check that! i did notice fancy behaviour of the training details, but haven't performed such thorough tests. the process for a single image looked ok to me as is, but i had a tough time struggling with multi-phrase continuous runs - the imagery tended to get stuck after 7-10 steps. meanwhile i resorted to weird tricks of mixed initialization + further interpolation; hopefully your approach may help there (since the optimizer and params are recreated every cycle anyway).

i'll keep this open till further exploration.

eps696 commented 3 years ago

@torridgristle i gave it a try, here are my findings:

i will add and expose the progressive mode as an option (and change the default lrate) anyway, to encourage further experiments.

please note also that: 1) the learning rate of the optimizer can be changed on the fly, without recreating it, as follows (see also the scheduler sketch after this list):

for g in optimizer.param_groups:
    g['lr'] = lr_init + (i / steps) * (lr_max - lr_init)  # update in place, keeping Adam's moment state

2) the generation proceeds by patches/samples of random size (and position), and it's their size that directly affects the size of the painted features. so it may be worth changing that size progressively in the slice_imgs function (@jonathanfly proposed that a while ago, but i didn't dig into it; a hypothetical sketch follows below).
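
re point 1: the same ramp can also be expressed with a stock scheduler, e.g. torch.optim.lr_scheduler.LambdaLR; a sketch (assuming i advances once per training step, with a toy parameter for illustration):

import torch
from torch.optim.lr_scheduler import LambdaLR

params = [torch.randn(8, requires_grad=True)]  # toy parameter (assumption)
lr_init, lr_max, steps = 0.0001, 0.008, 300

optimizer = torch.optim.Adam(params, lr_init)
# LambdaLR scales the base lr by the lambda's value, so the linear ramp
# lr(i) = lr_init + (i / steps) * (lr_max - lr_init) is written as a ratio of lr_init
scheduler = LambdaLR(optimizer, lambda i: 1 + (i / steps) * (lr_max / lr_init - 1))

for i in range(steps):
    loss = params[0].square().mean()  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()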
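
for point 2, a hypothetical sketch of such progressive patch sizing (the real slice_imgs signature isn't shown in this thread, so all names and bounds here are assumptions, not the repo's actual code):

import torch
import torch.nn.functional as F

def slice_progressive(img, count, step, steps, sz_min=0.2, sz_max=0.9, out=224):
    # hypothetical stand-in for slice_imgs: random square crops whose upper size
    # bound shrinks with progress, so early steps see large patches (global shapes)
    # and late steps see small ones (fine details)
    _, _, H, W = img.shape
    hi = sz_max - (step / steps) * (sz_max - sz_min)  # shrinking upper bound
    crops = []
    for _ in range(count):
        frac = sz_min + torch.rand(1).item() * (hi - sz_min)
        size = min(max(int(frac * min(H, W)), out // 2), min(H, W))
        y = torch.randint(0, H - size + 1, (1,)).item()
        x = torch.randint(0, W - size + 1, (1,)).item()
        patch = img[:, :, y:y + size, x:x + size]
        crops.append(F.interpolate(patch, (out, out), mode='bilinear', align_corners=False))
    return torch.cat(crops)  # [count, 3, out, out], ready for the CLIP encoder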

eps696 commented 3 years ago

further tests have shown that in most cases a progressive lrate does have some impact on the composition. i would not call it "larger shapes enhancement" (sometimes it just drew significantly fewer elements of all kinds), but it's worth having among the options. exact lrate values depend heavily on the model, the prompts and (especially) the resolution: even 0.03-0.05 was not enough to cover the whole 4k frame in some cases (training up to 1000 steps and 256 samples). i also tested rates as big as 1~10, and they have their own interesting specifics (not explicitly generalizable into a common rule).

eps696 commented 3 years ago

@torridgristle implemented and mentioned in the readme. thanks for the discovery; closing now.