All weights set to NaN

ghost commented 9 years ago

Hello,

First of all, thanks a lot for sharing your work, it is really interesting.

About my problem: I'm trying to train a CNN using RGB images as input. My training set has size kXSize = 33x33x3x55000. I use a typical architecture (I,C,S,C,S,C,S,F,F) with no modifications to any hyperparameter. I define the input layer as: struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3)). The input matrices are normalized and correctly defined.

The problem is that after the first epoch of training, all the weights come back as NaN. Do you have any idea why the weights are not computed properly? I suspect I am not specifying the input correctly, but I can't find where, since the program otherwise runs 'normally' (one epoch of training takes about 1500 seconds).

Thanks for your attention,

Quentin

sdemyanov commented 9 years ago

Hi Quentin,

I suppose the reason might be a learning rate that is too large, or the weight initialization. Try decreasing the rate and increasing or decreasing the weight variance (the 'initstd' parameter). That should help.
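
To illustrate how a rate that is too large ends in NaN, here is a minimal MATLAB sketch on a toy one-dimensional loss. It is not ConvNet code, and the names a, rates, nSteps and finalW are made up for the example.

```matlab
% Toy illustration (not ConvNet code): gradient descent on f(w) = 0.5 * a * w^2.
% All names here are hypothetical and chosen only for this sketch.
a = 10;                 % curvature of the toy loss
rates = [0.01, 0.5];    % a small and a too-large learning rate
nSteps = 1000;
finalW = zeros(size(rates));

for k = 1:numel(rates)
    w = 1;                          % initial weight
    for step = 1:nSteps
        grad = a * w;               % df/dw
        w = w - rates(k) * grad;    % plain gradient descent update
    end
    finalW(k) = w;
end

disp(finalW);               % first entry is near 0, second is NaN
nanMask = isnan(finalW);    % same idea on real weights: any(isnan(weights(:)))
```

With the larger rate each step multiplies the weight by -4, so it overflows to Inf and the next update computes Inf - Inf, which is NaN. For this toy loss the update is only stable below a rate of 2/a = 0.2; a real network has no such closed-form bound, so the rate usually has to be lowered by trial.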

Regards, Sergey.

PhD candidate, Computing and Information Systems, The University of Melbourne.

Sergey Demyanov http://www.demyanov.net/

ghost commented 9 years ago

Hi Sergey,

Thanks for your answer. It seems the CNN was indeed not converging. I have some decent weights now (I think) after lowering the alpha and momentum parameters, but I've just run into another problem...

My CNN is a simple classifier (output of size 1). My training and testing sets are each half positive and half negative examples. The problem is that after training, the cnntest function returns a prediction vector containing only ones, and this is accepted as the correct answer for every sample. I get a 0% error even though my testing set contains negative examples. I can't understand why the CNN only ever answers 'positive', since my sets are evenly balanced.

Sorry for bothering you with this, I'm still not used to neural networks.

Thanks again,

Quentin

sdemyanov commented 9 years ago

Hi Quentin,

I guess your problem now is that you use the softmax output layer, which is supposed to output probabilities. You either need to change the type of the nonlinear function on the last layer and set the loss function to 'squared', or, preferably, change the format of your labels: output size 2, where each column corresponds to a particular class and all values are zeros except a single '1' for the right class.
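
As a rough sketch of the second option, assuming the labels start out as a column vector of 0/1 values (the variable names and the one-row-per-sample layout are assumptions for illustration, so check the orientation your training call expects):

```matlab
% Hypothetical data: one label per sample, 0 = negative, 1 = positive.
labels = [1; 0; 0; 1; 1];
numClasses = 2;

% One row per sample, one column per class (an assumed layout).
onehot = zeros(numel(labels), numClasses);
rows = (1:numel(labels))';
cols = labels + 1;                  % class 0 -> column 1, class 1 -> column 2
onehot(sub2ind(size(onehot), rows, cols)) = 1;

% Softmax over a single output unit normalizes to 1 for any input,
% which is why a size-1 softmax output answers 'positive' every time.
z = [-3.2; 0; 7.5];                         % arbitrary pre-activations, one unit each
singleSoftmax = exp(z) ./ sum(exp(z), 2);   % each row has one element, so each row is 1
```

The last two lines show why the size-1 setup looked perfect: a softmax with one unit divides a number by itself, so the prediction never depends on the input.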

Regards, Sergey.

ghost commented 9 years ago

Hi Sergey,

I'm ashamed I didn't realize this. I really need to work more on neural networks.

Thanks a lot for your help, it was extremely useful.

Good luck with your future plans,

Best regards,

Quentin

sdemyanov commented 9 years ago

No problem, thank you)

MosTec1991 commented 9 years ago

Hi

RGB and grayscale images have values in the range 0-255 for each pixel. To fix this problem, scale each pixel to the range 0-1. Try this line of code when you import your data:

MyPicData = double(MyPicData) / 255;
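
One caveat, assuming the images are still stored as uint8 when they are imported: dividing an integer array by 255 keeps the integer class and rounds every pixel to 0 or 1, so the data should be converted to floating point first. A small sketch with stand-in random data (not the real training set) shows the difference and adds a quick sanity check:

```matlab
% Stand-in data: a small random uint8 batch shaped like 33x33 RGB images.
MyPicData = uint8(randi([0 255], 33, 33, 3, 4));

wrong  = MyPicData / 255;            % stays uint8: values round to 0 or 1
scaled = double(MyPicData) / 255;    % floating point in [0, 1]

% Quick checks worth running on the real input before training.
fprintf('class: %s, min: %g, max: %g\n', ...
        class(scaled), min(scaled(:)), max(scaled(:)));
assert(~any(isnan(scaled(:))), 'input contains NaN');
```

If your setup expects single precision, single(MyPicData) / 255 behaves the same way.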

sdemyanov / ConvNet

All weights set to NaN #23