pluskid / Mocha.jl

Deep Learning framework for Julia

Applying to real-valued target outputs #197

Open hpoit opened 8 years ago

hpoit commented 8 years ago

What are the advantages of applying Mocha to real-valued target outputs, and how do I do it?

mcreel commented 8 years ago

With real-valued outputs, you're using the net to do nonlinear regression. An example of how to do it is at https://github.com/mcreel/NeuralNetsForIndirectInference.jl

On Tue, Apr 26, 2016 at 8:37 PM, Henry Poitras notifications@github.com wrote:

What are the advantages of applying Mocha to real-valued target outputs, and how do I do it?

hpoit commented 8 years ago

Thanks

mcreel commented 8 years ago

Hi Henry, With no hidden layer and no nonlinear activation function, a simple NN with real-valued outputs is just a linear regression model. Training the net by backpropagation is a slow way to learn the parameters; the fast way is to use the OLS solution. But conceptually, the net is a linear regression model. Once the activation function is nonlinear and there are hidden layers, the net is a nonlinear regression model, often with a great number of parameters. The nice thing about neural nets is that even if they have many parameters, compared to what someone who works with ordinary regression models would be used to, they learn to set the superfluous parameters to zero, or close to it. This supposes that you have large training and testing sets; with small data sets, superfluous parameters are more of a problem.
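
To make the linear case concrete, here is a minimal sketch in plain Python (not Mocha code). It assumes a toy data set with one input, and the names `ols_fit` and `gd_fit` are made up for illustration. It fits the same model, y = a + b*x, once with the closed-form OLS solution and once by gradient descent on the squared error, which is what training a net with no hidden layer and no activation function amounts to.

```python
# Hypothetical toy data (illustrative only): y = 2 + 3x with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 + 3.0 * x for x in xs]

def ols_fit(xs, ys):
    """Closed-form least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def gd_fit(xs, ys, lr=0.05, steps=5000):
    """Same model viewed as a one-neuron linear net, trained by
    gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(2 * (a + b * x - y) for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a + b * x - y) * x for x, y in zip(xs, ys)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

a_ols, b_ols = ols_fit(xs, ys)
a_gd, b_gd = gd_fit(xs, ys)
print("OLS:", a_ols, b_ols)
print("GD: ", a_gd, b_gd)
```

Both routes end up at essentially the same intercept and slope; the iterative one just needs thousands of passes over the data to get there, while OLS is a single computation.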

In my experience, working with problems where data sets can be very large, trying out several different net configurations (numbers of neurons and layers) and choosing the net that gives the best performance on the testing set has given very nice results. I don't have experience with problems with small data sets, where selecting the configuration would be more difficult (I imagine).
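
The selection procedure described above might be sketched as follows, again in plain Python rather than Mocha. Everything here is an assumption made for illustration: the synthetic sin(x) data, the helper names `train_net`, `mse` and `sample`, the learning rate, and the candidate sizes. For brevity it only varies the number of neurons in a single tanh hidden layer, not the number of layers.

```python
import math
import random

def train_net(train, hidden, lr=0.02, epochs=200, seed=0):
    """Train a one-hidden-layer tanh net with a linear output by plain SGD."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0
    for _ in range(epochs):
        for x, y in train:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            for j in range(hidden):
                # Gradient of 0.5*err^2 reaching hidden unit j (uses w2 before its update).
                back = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * back * x
                b1[j] -= lr * back
            b2 -= lr * err

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return predict

def mse(predict, data):
    """Mean squared error of a predictor on a list of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical synthetic data: noisy samples of sin(x) on [-3, 3].
data_rng = random.Random(1)

def sample(n):
    pts = []
    for _ in range(n):
        x = data_rng.uniform(-3, 3)
        pts.append((x, math.sin(x) + data_rng.gauss(0, 0.1)))
    return pts

train, test = sample(200), sample(100)

# Try several configurations and keep the one with the lowest testing-set error.
candidates = [1, 2, 4, 8]
scores = {h: mse(train_net(train, h), test) for h in candidates}
best = min(scores, key=scores.get)
print(scores, "-> chosen hidden size:", best)
```

With a real problem the loop would stay the same and only `train_net` would change to whatever framework builds and trains the net.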

On Tue, Apr 26, 2016 at 11:05 PM, Henry Poitras notifications@github.com wrote:

Thanks Michael. I wonder if it makes sense to program the net to identify and apply linear and/or nonlinear regressions when applicable?

hpoit commented 8 years ago

@pluskid Could you guide me in adjusting the net to correctly predict the targets on train.csv?

https://s3-sa-east-1.amazonaws.com/nu-static/workable-data-science/data-science-puzzle.zip