Closed: 2-dor closed this issue 1 month ago
Definitely. Out of curiosity, would you mind sharing a reamp of something where you're seeing leaky ReLU do better?
Sweet. Absolutely. I'll supply a reamp and models trained with Tanh and LeakyReLU sometime later today.
Appreciate the open approach to it.
I started with a blank slate today, after having installed v0.1.0 of the trainer.
For the Standard architecture, keeping everything else the same bar the activation function, Tanh and LeakyReLU perform the same, so I stand corrected.
I do see the LeakyReLU runs deviate more, even towards the later part of a 1000-epoch session, but the final ESR is still more or less the same.
Tried it with an lr_decay of 0.007 and of 0.0045 and the result is similar, so I take that back - sorry for the red herring.
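For anyone wanting to reproduce this kind of like-for-like comparison outside the trainer, the idea is just to hold the network fixed and swap only the nonlinearity. A minimal NumPy sketch (layer sizes and names are illustrative, not the trainer's actual config):

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def leaky_relu(x, negative_slope=0.01):
    # Identity for x >= 0, small linear slope for x < 0
    return np.where(x >= 0, x, negative_slope * x)

def forward(x, weights, activation):
    # Same stack of linear layers; only the nonlinearity differs
    for w in weights:
        x = activation(x @ w)
    return x

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 8)) * 0.1 for _ in range(3)]
x = rng.standard_normal((4, 8))

y_tanh = forward(x, weights, tanh)
y_lrelu = forward(x, weights, leaky_relu)
```

With identical weights, data, and training schedule, any difference in the two runs is attributable to the activation alone.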
FWIW If LeakyReLU works just as well, it should be a more CPU-efficient layer, so I'd still like it :)
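The efficiency intuition is easy to sanity-check: tanh costs a transcendental evaluation per sample, while leaky ReLU is a compare-and-scale. A rough microbenchmark (illustrative only, not representative of the plugin's real-time path):

```python
import timeit

import numpy as np

# One million float32 samples, roughly ~20 s of audio at 48 kHz
x = np.random.default_rng(0).standard_normal(1_000_000).astype(np.float32)

def run_tanh():
    return np.tanh(x)

def run_leaky_relu(negative_slope=0.01):
    # Compare-and-scale; no transcendental function involved
    return np.where(x >= 0, x, negative_slope * x)

t_tanh = timeit.timeit(run_tanh, number=50)
t_lrelu = timeit.timeit(run_leaky_relu, number=50)
print(f"tanh: {t_tanh:.3f}s  leaky_relu: {t_lrelu:.3f}s")
```

Actual savings in the plugin would depend on how the activation is vectorized there, so treat this as a back-of-the-envelope check.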
Nice :D The models I trained yesterday on 0.10.0 were very close to ones trained with Tanh on the same reamp:
I've trained a few more models with LeakyReLU and Tanh and the resulting ESRs are within splitting-hairs distance of each other. If LeakyReLU is indeed more CPU-efficient, I suspect a lot of folks would be happy to need fewer resources for the same profile / model quality.
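For anyone comparing runs the same way: ESR here is the standard error-to-signal ratio used in amp-modelling work, i.e. the residual energy normalized by the target signal's energy. A minimal sketch:

```python
import numpy as np

def esr(target, prediction):
    """Error-to-signal ratio: sum of squared error over sum of squared target."""
    target = np.asarray(target, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    return float(np.sum((target - prediction) ** 2) / np.sum(target ** 2))

# A perfect model scores 0; a model outputting silence scores 1.
y = np.array([0.5, -0.25, 0.1])
print(esr(y, y))                 # 0.0
print(esr(y, np.zeros_like(y)))  # 1.0
```

Since the metric is normalized, it is comparable across reamps of different levels, which is what makes these "splitting hairs" comparisons meaningful.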
The behavior I observe carries over to custom architectures too.
Link contains a reamp.wav file (I used the v3_0_0.wav for it) and the "lightning_logs" folder contents for the same architecture trained with Tanh and LeakyReLU.
https://mega.nz/file/Hk40lLLB#VOCaPdFBSJyTxZNYSdHD5WNu6dDTAJvpdC8DOnt2j24
Hey Steve, folks,
I've been trying some different things out with the trainer; I trained a few models with the LeakyReLU activation function and it seems the models converge to better ESRs.
However, when I try to load these in the NAM plugin, it crashes.
Could support for these types of models be added please? Would love to have a way of playing them back.