Closed: 2-dor closed this issue 1 month ago
Definitely. Out of curiosity, would you mind sharing a reamp of something where you're seeing leaky ReLU do better?
Sweet. Absolutely. I'll supply a reamp and models trained with Tanh and LeakyReLU sometime later today.
Appreciate the open approach to it.
I started with a blank slate today, after having installed v0.1.0 of the trainer.
For the Standard architecture, keeping everything else the same bar the activation function, Tanh and LeakyReLU perform the same, so I stand corrected.
I do see the LeakyReLU runs deviate more, even towards the later part of a 1000-epoch session, but the final ESR is still more or less the same.
Tried it with an lr_decay of 0.007 and of 0.0045 and the result is similar, so I take that back - sorry for the red herring.
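For anyone wanting to reproduce this kind of like-for-like comparison outside the trainer, the idea is just to hold the network fixed and swap only the nonlinearity. A minimal NumPy sketch (layer sizes and names are illustrative, not the trainer's actual config):

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def leaky_relu(x, negative_slope=0.01):
    # Identity for x >= 0, small linear slope for x < 0
    return np.where(x >= 0, x, negative_slope * x)

def forward(x, weights, activation):
    # Same stack of linear layers; only the nonlinearity differs
    for w in weights:
        x = activation(x @ w)
    return x

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 8)) * 0.1 for _ in range(3)]
x = rng.standard_normal((4, 8))

y_tanh = forward(x, weights, tanh)
y_lrelu = forward(x, weights, leaky_relu)
```

With identical weights, data, and training schedule, any difference in the two runs is attributable to the activation alone.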
FWIW If LeakyReLU works just as well, it should be a more CPU-efficient layer, so I'd still like it :)
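The efficiency intuition is easy to sanity-check: tanh costs a transcendental evaluation per sample, while leaky ReLU is a compare-and-scale. A rough microbenchmark (illustrative only, not representative of the plugin's real-time path):

```python
import timeit

import numpy as np

# One million float32 samples, roughly ~20 s of audio at 48 kHz
x = np.random.default_rng(0).standard_normal(1_000_000).astype(np.float32)

def run_tanh():
    return np.tanh(x)

def run_leaky_relu(negative_slope=0.01):
    # Compare-and-scale; no transcendental function involved
    return np.where(x >= 0, x, negative_slope * x)

t_tanh = timeit.timeit(run_tanh, number=50)
t_lrelu = timeit.timeit(run_leaky_relu, number=50)
print(f"tanh: {t_tanh:.3f}s  leaky_relu: {t_lrelu:.3f}s")
```

Actual savings in the plugin would depend on how the activation is vectorized there, so treat this as a back-of-the-envelope check.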
Nice :D The models I trained yesterday on 0.10.0 were very close to ones trained with Tanh on the same reamp:
I've trained a few more models with LeakyReLU and Tanh and the resulting ESRs are within splitting-hairs distance of each other. If LeakyReLU is indeed more CPU-efficient, I suspect a lot of folks would be happy to need fewer resources for the same profile / model quality.
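For anyone comparing runs the same way: ESR here is the standard error-to-signal ratio used in amp-modelling work, i.e. the residual energy normalized by the target signal's energy. A minimal sketch:

```python
import numpy as np

def esr(target, prediction):
    """Error-to-signal ratio: sum of squared error over sum of squared target."""
    target = np.asarray(target, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    return float(np.sum((target - prediction) ** 2) / np.sum(target ** 2))

# A perfect model scores 0; a model outputting silence scores 1.
y = np.array([0.5, -0.25, 0.1])
print(esr(y, y))                 # 0.0
print(esr(y, np.zeros_like(y)))  # 1.0
```

Since the metric is normalized, it is comparable across reamps of different levels, which is what makes these "splitting hairs" comparisons meaningful.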
The behavior I observe carries over to custom architectures too.
Link contains a reamp.wav file (I used the v3_0_0.wav for it) and the "lightning_logs" folder contents for the same architecture trained with Tanh and LeakyReLU.
https://mega.nz/file/Hk40lLLB#VOCaPdFBSJyTxZNYSdHD5WNu6dDTAJvpdC8DOnt2j24
Hey Steve, folks,
I've been trying some different things out with the trainer; I trained a few models with the LeakyReLU activation function and it seems the models converge to better ESRs.
However, when I try to load these in the NAM plugin, it crashes.
Could support for these types of models be added please? Would love to have a way of playing them back.