Closed: san9569 closed this issue 3 years ago
Hi @san9569 ,
Yes, ReLU should be fine! In fact, you may want to check this paper: https://arxiv.org/pdf/2006.08591.pdf. They also use ReLU (but parameterize W differently to ensure provable convergence).
When you use your proposed MLP as the layer f, it may become unstable. In that case, you can (1) make sure your parameters are initialized to very small values, (2) add weight decay if needed, and (3) add normalization if needed.
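For reference, here is a minimal PyTorch sketch of a ReLU layer with those three stabilizers; the class name `StabilizedDEQLayer`, the initialization scale, and the weight-decay value are illustrative assumptions, not from any particular codebase:

```python
import torch
import torch.nn as nn

class StabilizedDEQLayer(nn.Module):
    """Illustrative f_theta(z, x) = norm(relu(W z + x + b)) with the stabilizers above."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)  # (3) normalization, add only if needed
        # (1) initialize parameters to very small values
        nn.init.normal_(self.linear.weight, std=0.01)
        nn.init.zeros_(self.linear.bias)

    def forward(self, z, x):
        return self.norm(torch.relu(self.linear(z) + x))

# (2) weight decay is applied through the optimizer, e.g.:
# optimizer = torch.optim.Adam(layer.parameters(), lr=1e-3, weight_decay=1e-4)
```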
I'm closing this issue for now, but do let me know (and feel free to re-open) if you have other questions!
Thanks for your kind answer and for recommending the paper!
Hello
First of all, I really appreciate you sharing your awesome code. I want to train an MLP model with DEQ, as shown in Chapter 4 of your nice tutorial page (http://implicit-layers-tutorial.org/deep_equilibrium_models/).
For example, if f_theta is:
f_theta(z, x) = relu(Wz + x + b)
I wonder whether I can train it as an equilibrium model. Thank you!
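For concreteness, here is a minimal sketch of computing the equilibrium of this f_theta by naive forward iteration; the tolerance, iteration cap, and small initialization scale are illustrative assumptions, not values from the tutorial:

```python
import torch

def fixed_point(f, x, z0, tol=1e-4, max_iter=50):
    """Iterate z_{k+1} = f(z_k, x) until the relative residual is small."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).norm() / (z.norm() + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# The proposed layer: f_theta(z, x) = relu(W z + x + b)
dim = 16
W = torch.randn(dim, dim) * 0.01  # small weights keep the iteration well behaved
b = torch.zeros(dim)
f = lambda z, x: torch.relu(z @ W.T + x + b)

x = torch.randn(dim)
z_star = fixed_point(f, x, torch.zeros(dim))  # approximate equilibrium z* = f(z*, x)
```

In a full DEQ one would differentiate through z_star implicitly (as in Chapter 4 of the tutorial) rather than backpropagating through the iterations.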