alandiamond / spinnaker-neuromorphic-classifier

Python PyNN code demonstrating classification using SpiNNaker neuromorphic hardware. This work was developed at the University of Sussex under funding from the Human Brain Project (HBP).

How do you get the input data when you train the network #2

Open DaisyRQin opened 8 years ago

DaisyRQin commented 8 years ago

Hi. I tried to test the code. However, I found that I cannot get the input data directly from the MNIST database images. It is said in "A neuromorphic network for generic multivariate data classification" that the VR responses were generated by a neural gas algorithm using mdp. I am wondering how you did that, because if I do not know how to prepare the input data, I will never be able to test new pictures, and that really bothers me.

Could you please just give me some details on how to get your ClassActivation_SpikeSourceData.csv and VrResponse_SpikeSourceData.csv?

Thanks a lot.

tnowotny commented 8 years ago

Hi, just a quick note that Alan is on leave at the moment for about another week. I am sure he'll be happy to give you some pointers then.

DaisyRQin commented 8 years ago

Thank you so much. I have been puzzled by this issue for about two weeks. I tried to train the NeuralGasNode in mdp but still could not get the same data.

DaisyRQin commented 8 years ago

Hi. I am wondering if Alan is back and can give me some tips. I am quite upset these days. Thanks so much.

alandiamond commented 8 years ago

Hi. If you read the Methods sections in these two papers (Paper 1: http://loop.frontiersin.org/publications/44025851 , Paper 2: http://iopscience.iop.org/article/10.1088/1748-3190/11/2/026002?fromSearchPage=true), they explain in detail how the model input is generated from the dataset that you wish to classify. What is it you are trying to do? Do you have a SpiNNaker board? Alan

DaisyRQin commented 8 years ago

Thank you so much. Yes, I have a SpiNNaker board now, and I am trying to use this code to classify the digit pictures from the MNIST database. I am doing a summer internship at Imperial College right now, and this is the first time I have worked with the board.

I still have some problems with the method; I read the second article you mentioned two weeks ago, but I am quite puzzled about how neural gas works. I installed the MDP package and tried to use neural gas to train the data, but I could not get any output. Is it true that you initialize the neural gas node with num_node = 50 and then put one 28*28 image into training? What do I do after that?

I haven't read the first article yet; I will study it tonight. I really hope to get everything right. I am sorry to bother you.

Many thanks.

Rui

alandiamond commented 8 years ago

The MDP library's implementation of neural gas takes a numpy array as input, where each row describes one item in the training set you have chosen. So for MNIST, each row in the array should contain 28 contiguous sets of 28 pixels, i.e. 784 data points, one for each pixel in the image. However, the standard MNIST dataset is supplied in an unusual compressed format which has to be decoded to create the format described above. It's also a large dataset, and this Python implementation will take a long time to run if you give it all 60,000 training images, so you also need to select a subset of images to use as your training set.

You will need to decode the raw MNIST files to create the data file you want. There are probably Python programs on the Internet that do this already; I used a C++ program that I found.
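
In outline, a decoder might look something like this (just a sketch, not the program I used; the gzipped file names and the choice of a 500-image subset are only illustrative):

```python
# Decode the raw MNIST IDX files into the flat CSV format described above:
# one row per image, 784 comma-separated pixel values.
import gzip
import struct
import numpy as np

def load_mnist_images(path):
    """Read a gzipped MNIST IDX image file into an (N, 784) uint8 array."""
    with gzip.open(path, 'rb') as f:
        data = f.read()
    # IDX header: magic number, image count, rows, cols (big-endian 32-bit ints)
    _, num_images, rows, cols = struct.unpack('>IIII', data[:16])
    pixels = np.frombuffer(data, dtype=np.uint8, offset=16)
    return pixels.reshape(num_images, rows * cols)

def load_mnist_labels(path):
    """Read a gzipped MNIST IDX label file into a length-N uint8 array."""
    with gzip.open(path, 'rb') as f:
        data = f.read()
    return np.frombuffer(data, dtype=np.uint8, offset=8)

if __name__ == '__main__':
    images = load_mnist_images('train-images-idx3-ubyte.gz')
    labels = load_mnist_labels('train-labels-idx1-ubyte.gz')
    # keep a manageable subset, e.g. the first 500 images
    np.savetxt('mnist_train_500.csv', images[:500], fmt='%d', delimiter=',')
    np.savetxt('mnist_train_500_labels.csv', labels[:500], fmt='%d')
```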

I have attached a Python program I wrote that uses mdp to generate n VRs from a comma-delimited training data file, which is loaded into the numpy array. You pass the value of n and the path to the data file as arguments, as well as the filename in which to place the set of resulting VRs.
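
The attachment itself is not visible in this thread, but the general idea is something like the following (a minimal sketch only, assuming mdp's NeuralGasNode interface with num_nodes and get_nodes_position(); the training parameters are illustrative, not the ones in my script):

```python
# Generate n VRs (neural gas prototypes) from a comma-delimited training file.
import sys
import numpy as np
import mdp

def generate_vrs(n_vrs, training_csv, output_csv):
    # one row per training item, 784 columns for a 28x28 MNIST image
    data = np.loadtxt(training_csv, delimiter=',')

    # train a neural gas with n_vrs nodes; each node converges on a prototype
    # ("virtual receptor") position in the 784-dimensional input space
    ng = mdp.nodes.NeuralGasNode(num_nodes=n_vrs, max_epochs=10)
    ng.train(data)
    ng.stop_training()

    # collect the node positions: an (n_vrs, 784) array of VR coordinates
    np.savetxt(output_csv, ng.get_nodes_position(), delimiter=',')

if __name__ == '__main__':
    # e.g. python generate_vrs.py 50 mnist_train_500.csv VrSet.csv
    generate_vrs(int(sys.argv[1]), sys.argv[2], sys.argv[3])
```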

Having got your set of VRs, you would need to create a VR response to represent each image in numVR-dimensional space (see the papers for how the response is worked out). The strength of these responses determines the spike rate you use to generate entries in the spike source file for each of the numVR input neurons.

It's not a trivial undertaking, I'm afraid ;-)

HTH

Alan

DaisyRQin commented 8 years ago

Thanks a lot. This is very helpful and I will try it.

Rui

DaisyRQin commented 8 years ago

BTW, I haven't found the code you mentioned, which uses mdp to generate n VRs from a comma-delimited training data file loaded into the numpy array.

alandiamond commented 8 years ago

Maybe GitHub email removes attachments? What is your direct email?

DaisyRQin commented 8 years ago

My email is daisyqin@outlook.com.

Thank you so much.

DaisyRQin commented 8 years ago

Hi, Alan. Sorry to interrupt you again. I used 500 images to train the neural gas node and got the VRs generated by your code, which form a 50 * 784 matrix. Then I took another 500 images to calculate the VR responses.

In paper 2 (http://iopscience.iop.org/article/10.1088/1748-3190/11/2/026002?fromSearchPage=true) there is a formula for how to calculate the VR response, and from it I got an r of size 50 * 500. But "A neuromorphic network for generic multivariate data classification", which is the reference paper of the original code, has a totally different core formula, and from that I got another set of r, also 50 * 500. However, both of these r are different from what you provided with the original code: each of my r_i is in [0, 1], but your data is much larger than that. I don't know whether I still need to calculate the spiking rate, but it seems that in VrResponse_SpikeSourceData.csv you just provided VR responses.

I also have some questions about the VR response formula. It requires the maximum and minimum Manhattan distances observed in the data set, or the average distance between all the input points in the set. I don't know what to do if I just have one test image. I thought I could use the training maximum, training minimum and training average to get the VR response, but I do not have the data used in the original SpiNNaker code.

I really want to train the network by myself, but again I can't find supporting material on how to get ClassActivation_SpikeSourceData.csv. I do think it contains the image label information, without which I cannot train the network. That's what I tried and failed at again this week. I have read the articles again and again but have to ask for help. I believe there is something I haven't noticed, and I would really appreciate it if you could give me more tips.

Here are the two formulas for calculating the VR responses:

r_temp = 1 - (d_temp[j] - dmin) / (dmax - dmin)

or

r_temp = np.exp(-((5 * d_temp[j]) / davg)**0.7)

Here, d_temp[j] is the distance between the jth input data point and the current VR, dmin is the minimum distance, dmax is the maximum distance, and davg is the average distance between all the input points and the current VR.
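
In numpy terms, the two alternatives I tried look roughly like this (just a sketch; I used Manhattan distance and took dmin, dmax and davg per VR over the input set, which is my interpretation of the papers):

```python
# Sketch of the two VR response normalisations described above, applied to a
# set of images and a set of VRs using Manhattan (city-block) distance.
import numpy as np

def vr_responses(images, vrs, method='linear'):
    """images: (n_images, 784), vrs: (n_vrs, 784) -> r of shape (n_vrs, n_images)."""
    r = np.zeros((vrs.shape[0], images.shape[0]))
    for i in range(vrs.shape[0]):
        # Manhattan distance from every image to the current VR
        d = np.abs(images - vrs[i]).sum(axis=1)
        if method == 'linear':
            # r = 1 - (d - dmin) / (dmax - dmin)
            dmin, dmax = d.min(), d.max()
            r[i] = 1.0 - (d - dmin) / (dmax - dmin)
        else:
            # r = exp(-((5 * d) / davg) ** 0.7)
            davg = d.mean()
            r[i] = np.exp(-((5.0 * d) / davg) ** 0.7)
    return r
```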

Although my summer internship is about to finish, I really want to know how to do these things.

Many thanks.

Rui

alandiamond commented 8 years ago

It sounds like you are nearly there. You have a [0,1] response from each of the 50 VRs for every entry in the test set (500 images). You are going to present each image for, say, 200ms. Each input neuron (50 neurons for 50 VRs) is going to rate-code its corresponding VR response: the maximum spike rate would be, say, 200Hz representing a response of r=1.0, 100Hz representing r=0.5, etc.

The spike source data file has a row for each input neuron (50 neurons for 50 VRs). Each row specifies a list of times (ms) at which the neuron should spike, and the times depend on the spike frequency required. Imagine that VR0 responds with r=0.5 for the first image and r=1.0 for the second image. 100Hz implies spike times at 10ms intervals (0, 10, 20, 30...200). After 200ms we switch to spikes at 200Hz, which implies spike times at 5ms intervals (205, 210, 215, 220...400). So in the first row of the spike source data we would see 0, 10, 20, 30...200, 205, 210, 215, 220...400.

This is what you see in the example spike source file. The only difference is that, to avoid every neuron spiking at t=0, the first spike time is dithered (e.g. selected randomly between 0 and 10).
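
Putting that together, the spike source rows could be generated along these lines (a sketch only; the 200ms presentation, 200Hz maximum rate and the dither range are illustrative values, not necessarily those used to build the example files):

```python
# Turn an (n_vrs, n_images) VR response matrix into one row of spike times per
# input neuron: each observation is presented for PRESENTATION_MS, the spike
# rate is proportional to the response, and the first spike of each
# presentation is dithered so the neurons do not all fire together.
import numpy as np

PRESENTATION_MS = 200.0   # time each observation is presented for
MAX_RATE_HZ = 200.0       # spike rate representing a response of r = 1.0

def spike_source_rows(responses, rng=np.random):
    rows = []
    for neuron_responses in responses:            # one input neuron per VR
        times = []
        for obs_idx, r in enumerate(neuron_responses):
            t_start = obs_idx * PRESENTATION_MS
            t_end = t_start + PRESENTATION_MS
            rate = r * MAX_RATE_HZ
            if rate <= 0.0:
                continue                          # no spikes for a zero response
            interval = 1000.0 / rate              # inter-spike interval in ms
            t = t_start + rng.uniform(0.0, interval)   # dithered first spike
            while t < t_end:
                times.append(round(t, 1))
                t += interval
        rows.append(times)
    return rows

def write_spike_source_csv(rows, path):
    with open(path, 'w') as f:
        for times in rows:
            f.write(','.join(str(t) for t in times) + '\n')
```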

The ClassActivation_SpikeSourceData.csv is the same format, but it targets the 10 output neurons (one per class label) during training to create spiking activity in the correct output. This causes a Hebbian association (weight potentiation) to be created by STDP between the PN neurons and a particular class. You should see in the spike source data that each row has large gaps between firing periods; this is because each row represents a single class (digit), and that digit only appears in the data every 10th presentation.
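
The class activation file can be built the same way, for example (again just a sketch; the teaching rate is an assumed value):

```python
# Build ClassActivation_SpikeSourceData.csv-style rows: one row per class,
# firing at a fixed "teaching" rate only during presentations of that class,
# which produces the large gaps between firing periods described above.
import numpy as np

PRESENTATION_MS = 200.0
TEACH_RATE_HZ = 100.0     # assumed teaching rate for the active class neuron

def class_activation_rows(labels, num_classes=10, rng=np.random):
    rows = [[] for _ in range(num_classes)]
    interval = 1000.0 / TEACH_RATE_HZ
    for obs_idx, label in enumerate(labels):
        t_start = obs_idx * PRESENTATION_MS
        t_end = t_start + PRESENTATION_MS
        t = t_start + rng.uniform(0.0, interval)   # dithered first spike
        while t < t_end:
            rows[label].append(round(t, 1))
            t += interval
    return rows
```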

DaisyRQin commented 8 years ago

Thank you so much, Alan. It sounds very interesting. I will try what I can do tonight.