openworm / neuronal-analysis

Tools to produce, analyse and compare both simulated and recorded neuronal datasets

Numerical differentiation and Improvements to RNN #13

Closed by BenjiJack 8 years ago

BenjiJack commented 8 years ago

Hi all, here is some additional work on the Kato "pipeline" and recurrent neural networks.

Kato pipeline

Note: I am quite unfamiliar with the technique developed by Chartrand and used in this off-the-shelf script. Nonetheless I am trying to use it here to see if we can reproduce Kato's analysis.

In this image from the Jupyter notebook, you see Kato's raw data, Kato's derivative data, and our PCA on Kato's derivative data; followed by Kato's raw data, our own computation of the derivative, and our PCA on that derivative.

[image]
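
For orientation, here is a minimal sketch of the differentiate-then-PCA step, assuming the traces are arranged as a (timepoints x neurons) array; NumPy's finite-difference gradient stands in for the Chartrand TV-regularized differentiation used in the off-the-shelf script, so this is illustrative rather than a faithful reproduction:

```python
import numpy as np
from sklearn.decomposition import PCA

def differentiate_and_pca(traces, dt=1.0, n_components=3):
    """Differentiate each neuron's trace in time, then run PCA on the derivatives.

    `traces`: (timepoints x neurons) array of calcium traces (assumed layout).
    np.gradient is a simple stand-in for Chartrand's TV-regularized derivative.
    """
    derivatives = np.gradient(traces, dt, axis=0)    # d/dt along the time axis
    pca = PCA(n_components=n_components)
    projected = pca.fit_transform(derivatives)       # (timepoints x n_components)
    return derivatives, projected, pca.explained_variance_ratio_
```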

RNNs

Much of the code is taken directly from @jrieke's implementation here - thank you for allowing me to use your GMM functions as I try to make this work.

I tried various configurations of the model (with and without a stateful LSTM layer, feeding in one data point at a time versus longer chunks, changing the number of mixture_components, changing the learning rate, and using different optimizers), but to no avail: I am still getting NaNs. As I am new to RNNs, and in particular do not understand all of the GMM code, I may be making an egregious error in how I am using the model, or it may be a more subtle problem.

A demo of how I am using the model can be found in laboratory/RNNDemo.ipynb.
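
For context, below is a minimal sketch (not the notebook code itself) of the stateful, one-timepoint-at-a-time configuration described above; a plain Dense output with MSE loss stands in for @jrieke's GMM layer and loss, and n_neurons, the layer size, and the optimizer are placeholders:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Hypothetical shapes; the real number of traces comes from the Kato data.
n_neurons = 107
batch_size = 1

model = Sequential()
model.add(LSTM(64,
               batch_input_shape=(batch_size, 1, n_neurons),  # one timepoint per step
               stateful=True))
model.add(Dense(n_neurons))  # stand-in for the GMM / mixture-density output layer
model.compile(optimizer='rmsprop', loss='mse')  # the real model uses the GMM loss

# X: (timepoints, 1, n_neurons) calcium traces; y: X shifted forward by one step.
# for _ in range(n_epochs):
#     model.fit(X, y, batch_size=batch_size, shuffle=False, epochs=1)
#     model.reset_states()
```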

Next steps

My goal is to get the RNN working, and then add a third row to the image above showing data that we generate from the RNN, differentiate, and then perform PCA on. In this way, we would reproduce Kato's analysis using Kato's own data, and also run the same analysis on data that we generate. Future analyses and simulations could take a similar approach: analyze the Kato data using some tool we build, generate similar data with a simulation, and then use the same analytic pipeline to compare our generated data to Kato's.

jrieke commented 8 years ago

@BenjiJack About the NaNs: in which range is your data? You should normalize it to roughly [-1, 1] (an even smaller range might help if you still get NaNs). If you have already taken care of this, try again to play around with the learning rate (and the other parameters). The GMM layer makes the whole thing very prone to numeric errors, so you sometimes have to reduce the learning rate by a few orders of magnitude to make it work (I know it's annoying). See also the discussion at https://github.com/fchollet/keras/issues/1608 (the GMM layer there is based on the same code as mine).
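
To make the normalization suggestion concrete, a simple global min-max rescaling to [-1, 1] (a sketch, not from either codebase) could look like:

```python
import numpy as np

def scale_to_range(data, low=-1.0, high=1.0):
    """Linearly rescale an array to [low, high] before feeding it to the network."""
    dmin, dmax = data.min(), data.max()
    return (data - dmin) / (dmax - dmin) * (high - low) + low

# Per-neuron (column-wise) scaling is another option:
# use data.min(axis=0) and data.max(axis=0) instead of the global min/max.
```

Combining this with a learning rate a few orders of magnitude lower (e.g. 1e-5 instead of 1e-2) is the kind of adjustment described above.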

Also, over the last few days I have been refactoring a bit of my code into a separate project, which should be completely agnostic of the type of data you feed to the network: timeseries-rnn (https://github.com/jrieke/timeseries-rnn). It's still experimental and poorly documented, but maybe you can give it a try to see if there's a mistake in your code. It's written as Python scripts, but you can easily pull the code into a Jupyter notebook or run the scripts via the %run script.py magic from a notebook.

BenjiJack commented 8 years ago

Thank you @jrieke. It sounds like I may not have normalized the data properly. I did try to vary the learning rate without success. I will try the normalization and see what happens.

The model did sometimes converge (although unreliably) when given random, normally distributed data in the (0, 1) range, so perhaps normalization is indeed the problem.

Your new codebase looks exciting and should be very helpful as well. Thank you for sharing it.

Traveling the next few days, will come back soon once I have a chance to dig in further.