jorgenkg / python-neural-network

This is an efficient implementation of a fully connected neural network in NumPy. The network can be trained by a variety of learning algorithms: backpropagation, resilient backpropagation and scaled conjugate gradient learning. The network has been developed with PYPY in mind.
BSD 2-Clause "Simplified" License
297 stars 98 forks

Overfitting #21

Closed NEGU93 closed 7 years ago

NEGU93 commented 7 years ago

First of all, I'd like to take this opportunity to thank you for your library. It has been a great help!

Second, I'm seeing overfitting. Around 5000 epochs the current error starts to rise, so further training is no longer useful, and unless I set a maximum number of iterations it will keep going forever.

My question is: is there an easy way to use early stopping or another anti-overfitting/regularization technique? (Although the library is very well documented, I cannot find anything of the sort.)

Thank you again.

jorgenkg commented 7 years ago

That's great to hear, and all feedback is very welcome!

As a first step, have you tried applying dropout? From my personal experience, when using dropout, I would stick with a momentum learning algorithm.
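As a rough illustration of the idea (not the library's actual API — `dropout_forward` is a hypothetical helper), inverted dropout zeroes a random fraction of activations during training and rescales the survivors so the expected activation is unchanged:

```python
import numpy as np

def dropout_forward(activations, dropout_rate, rng):
    """Inverted dropout: zero a random fraction of units and rescale
    the survivors by 1/keep_prob so the expected value is unchanged."""
    keep_prob = 1.0 - dropout_rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones((4, 8))
dropped = dropout_forward(a, dropout_rate=0.5, rng=rng)
# Surviving units are scaled by 1/keep_prob = 2.0; the rest are zero.
```

At test time, dropout is simply disabled and the full activations are used.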

If that fails, it is not uncommon to keep track of previous fitness scores and weights, in order to pick the best weight set.
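That best-weight bookkeeping can be sketched generically like this (a minimal sketch, assuming the hypothetical `train_step` and `evaluate` callbacks and a dict-like network; the library's real training loop will differ):

```python
import copy

def train_with_best_snapshot(network, train_step, evaluate, epochs):
    """Keep the weight set with the lowest error seen so far, so the
    final network is the best one encountered, not the last one."""
    best_error = float("inf")
    best_weights = None
    for _ in range(epochs):
        train_step(network)
        error = evaluate(network)
        if error < best_error:
            best_error = error
            best_weights = copy.deepcopy(network["weights"])
    # Restore the best snapshot before returning.
    network["weights"] = best_weights
    return best_error
```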

NEGU93 commented 7 years ago

Thank you for the quick response. Great, I hadn't noticed the dropout option before, my bad. I'll give it a try.

Thank you!

NEGU93 commented 7 years ago

OK, it helped a lot. Before, I reached an MSE of 0.24 before the error started growing again; with 50% dropout I reach 0.17.

Anyway, I changed the library to apply an early stopping technique as well. Your code is very well written, and it was easy to read.

jorgenkg commented 7 years ago

That's good news. As far as I recall, the suggested dropout factor is even higher than 0.5.

Adding an early stopping technique to the library sounds interesting. If you feel up for it, you are very welcome to open a pull request, and I will review it.
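A common shape for such an early-stopping check is patience-based: stop once the validation error has not improved for a given number of consecutive epochs. A minimal sketch (the `EarlyStopper` class and its parameters are illustrative, not the patch actually submitted):

```python
class EarlyStopper:
    """Stop training when the validation error has not improved by at
    least `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale = 0

    def should_stop(self, error):
        if error < self.best - self.min_delta:
            # Improvement: record it and reset the stale counter.
            self.best = error
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```

Inside a training loop, one would call `stopper.should_stop(validation_error)` once per epoch and break out of the loop when it returns `True`.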

NEGU93 commented 7 years ago

I used 0.5 because I found a paper where someone did something similar, but I'll run a Monte Carlo sweep now, since I have some time left.

I have my hand-in tomorrow; after that I'll make a pull request to add the early stopping. Sounds good!