pluskid / Mocha.jl

Deep Learning framework for Julia

Parameter flexibility question #207

Closed: qfn closed this issue 8 years ago

qfn commented 8 years ago

Having seen some behavior from the power layer that I did not expect, it would help to know which parameters in the network structure are actually optimized by the solver.

The weights and the biases in the InnerProductLayer are optimized.

Are the parameters in the PowerLayer optimized (scale, shift and power)?

Are the epsilons in Neurons.ReLU() / LReLU() optimized?

If not, is there any way to make the network search for the parameters mentioned above?

Thank you.

pluskid commented 8 years ago

No, they are not optimized. To optimize them, you would need to write your own layers that declare them as parameters and compute the gradients with respect to them. Alternatively, you can do a grid search, treating them like other hyperparameters.
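For anyone attempting the custom-layer route: assuming PowerLayer applies the Caffe-style power transform y = (shift + scale * x)^power, the backward pass of a trainable version would need the gradients of y with respect to scale, shift, and power. The snippet below is a minimal plain-Julia sketch of those gradients with a finite-difference check; it does not use Mocha's layer API, and all function names are illustrative only.

```julia
# Forward pass of the power transform: y = (shift + scale * x) ^ power
power_forward(x, scale, shift, power) = (shift .+ scale .* x) .^ power

# Analytic gradients of y with respect to the three parameters; these are
# what the backward pass of a trainable power layer would accumulate.
function power_param_gradients(x, scale, shift, power)
    base   = shift .+ scale .* x
    dscale = power .* base .^ (power - 1) .* x                    # ∂y/∂scale
    dshift = power .* base .^ (power - 1)                         # ∂y/∂shift
    dpower = power_forward(x, scale, shift, power) .* log.(base)  # ∂y/∂power
    return dscale, dshift, dpower
end

# Quick finite-difference check of ∂y/∂scale on a toy input.
x = rand(5)
scale, shift, power = 2.0, 1.0, 3.0
h = 1e-6
numeric = (power_forward(x, scale + h, shift, power) .-
           power_forward(x, scale - h, shift, power)) ./ (2h)
analytic, _, _ = power_param_gradients(x, scale, shift, power)
println(maximum(abs.(numeric .- analytic)))  # should be on the order of 1e-8
```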

qfn commented 8 years ago

Thank you.

Cody-G commented 7 years ago

I just dug up this issue because I'm also interested in creating a PowerLayer that can optimize its parameters (actually I only need to optimize scale). @qfn, did you end up implementing this? Your work could give me a head start. Thanks!
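For the narrower case of tuning only scale, the grid-search route suggested above can be scripted outside the network definition. A rough sketch follows, where `train_and_evaluate` is a hypothetical user-supplied function that would build the net with the given PowerLayer scale, run the Mocha solver, and return a validation loss; the placeholder body here only exists to keep the sketch runnable.

```julia
# Hypothetical helper: build the network with the given PowerLayer scale,
# run the solver, and return a validation loss. The real body depends on
# your concrete Mocha.jl setup; a random placeholder keeps this runnable.
function train_and_evaluate(scale::Float64)
    # ... construct layers and solver, train, evaluate on validation data ...
    return rand()
end

# Plain grid search over candidate scale values, as suggested above.
function grid_search(scales)
    best_scale, best_loss = NaN, Inf
    for scale in scales
        loss = train_and_evaluate(scale)
        if loss < best_loss
            best_scale, best_loss = scale, loss
        end
    end
    return best_scale, best_loss
end

best_scale, best_loss = grid_search(0.1:0.1:2.0)
println("best scale = $best_scale (validation loss = $best_loss)")
```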