WeiTang114 / MVCNN-TensorFlow

A Multi-View CNN (MVCNN) implementation with TensorFlow.
MIT License

The reason for the constant loss #2

Open weiweisunWHU opened 7 years ago

weiweisunWHU commented 7 years ago

Hello Wei, I have prepared the data and trained the model without changing anything, but the loss converges to 3.69. I then changed the initial learning rate but got the same converged loss (3.69). Do you know what the problem might be? Also, could you please provide the trained weights? Thanks a lot.

WeiTang114 commented 7 years ago

Could you post one of your rendered view images? A possible problem is that you have a grey object on a white background; we need a white object on a black background.

weiweisunWHU commented 7 years ago

Thank you for your reply! Here is one of the rendered images: airplane_0001_004. Did you successfully train this model? I ask because your FC8 layer is followed by a ReLU layer, which is why the model ends up outputting a constant value (0).

WeiTang114 commented 7 years ago

Would you try inverting the views so that they have a black background? A white background makes the activations unstable. You can also try my rendered views: https://drive.google.com/open?id=0B4v2jR3WsindMUE3N2xiLVpyLW8

weiweisunWHU commented 7 years ago

Thank you very much! I also trained the model successfully by using mean subtraction. I still discarded the last ReLU layer, though. Anyway, thanks a lot for your kind help.

ghost commented 7 years ago

@WeiTang114 I used the rendered views (from your link) as input and still get a constant loss of approximately 3.69, even after 25 epochs! Could you tell me what the issue might be? Also, is there a way to visualize loss/accuracy vs. epoch?

Thanks!

WeiTang114 commented 7 years ago

@priyam1994

ghost commented 7 years ago

@WeiTang114 Thanks for your reply! I used a learning rate of 0.001 for my training as suggested.

WeiTang114 commented 7 years ago

@priyam1994 Then I'm not sure what the problem might be. I would check whether the gradients exploded (in the "Distributions" tab in TensorBoard), whether the weights are poorly initialized, etc.
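
As for visualizing loss/accuracy vs. epoch: TensorBoard's Scalars tab plots any scalar summaries the training loop writes. As a framework-free fallback, the per-epoch values can also be dumped to a CSV and plotted later; a minimal sketch (hypothetical helper and values, not code from this repo):

```python
import csv

def log_epoch_metrics(path, history):
    """Write (epoch, loss, accuracy) rows to a CSV for later plotting."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch", "loss", "accuracy"])
        for epoch, (loss, acc) in enumerate(history, start=1):
            writer.writerow([epoch, loss, acc])

# A run stuck at the constant ~3.69 loss is easy to spot in such a dump.
log_epoch_metrics("metrics.csv", [(3.69, 0.02), (3.69, 0.02), (3.69, 0.03)])
```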

weiweisunWHU commented 7 years ago

In my experience, the learning rate for the deeper (fc) layers should be multiplied by 10. It worked for me; I recommend trying it, @priyam1994.

ghost commented 7 years ago

@weiweisunWHU Could you tell me how you changed the learning rate only for the fc layers? Additionally, did you train the network from scratch or fine-tune the pre-trained model? Thank you.

weiweisunWHU commented 7 years ago

@priyam1994 For example:

```python
opt1 = tf.train.AdamOptimizer(lr * 10).minimize(loss, var_list=listvar_update)
```

Here `var_list` should be the list of fc-layer variables.
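
The idea behind the snippet, shown framework-free: keep one parameter group (the pretrained conv layers) on the base learning rate and update the other group (the freshly initialized fc layers) with a 10x larger step. A NumPy sketch with made-up variables, not the repo's actual training loop:

```python
import numpy as np

def sgd_step(params, grads, lr):
    """One plain gradient-descent step on a list of weight arrays."""
    return [p - lr * g for p, g in zip(params, grads)]

lr = 0.0001
conv_params = [np.ones((3, 3))]        # stand-in for pretrained conv weights
fc_params = [np.ones((4,))]            # stand-in for freshly initialized fc weights
conv_grads = [np.full((3, 3), 0.5)]
fc_grads = [np.full((4,), 0.5)]

# Two "optimizers", mirroring the tf.train.AdamOptimizer(lr * 10) snippet above:
conv_params = sgd_step(conv_params, conv_grads, lr)      # slow, for conv layers
fc_params = sgd_step(fc_params, fc_grads, lr * 10)       # 10x faster, for fc layers
```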

Xmen0123 commented 7 years ago

@weiweisunWHU I am having the same problem, and with this information alone I could not solve it. Sorry, could you explain in a bit more detail? For example, which source code changes did you make?

WeiTang114 commented 7 years ago

@weiweisunWHU @priyam1994 @Xmen0123 Sorry! I found there was a typo in the README: "--learning-rate=0.001" should be "--learning_rate=0.001", so that argument had no effect...

Also, I found 0.0001 is more reliable for training.

I've updated the code (commit b476e17f11bd540f4f962ae157f20c17067996b2).

youkaichao commented 7 years ago

I also get the magic number 3.69. My rendered views have a white background and a size of 224x224. airplane_0627_012

With the constant loss of 3.69, I got a poor accuracy of just 2%. Too sad... there must be something wrong...

WeiTang114 commented 7 years ago

@youkaichao It should work with a black background; mine are in the comment above. Otherwise, you can invert the images offline or online (invert each image in input.py, right after it is read at line 27).

If simply inverting your images works, I'll consider adding an option such as "--white_background=True" 😅

youkaichao commented 7 years ago

@WeiTang114 It works! After adding the line below in input.py, right after line 27, the loss is no longer stuck at 3.69 and the accuracy is decent now.

```python
im = cv2.bitwise_not(im)
```

But I'm puzzled: what's the difference between white and black? If I feed the MVCNN white-background views, it should learn to identify white-background views, shouldn't it?

WeiTang114 commented 7 years ago

@youkaichao Black is 0 and white is 255. My theory is that a zero background passed through the convolution layers (practically matrix multiplications) yields zero outputs, while the greyscale object produces informative activations after the convolution. A black background therefore keeps the layer activations more stable than a white one.
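
The effect is easy to check numerically: a "valid" convolution over a zero (black) background stays exactly zero away from the object, while a 255 (white) background produces a large, near-constant response everywhere. A NumPy sketch (naive convolution, toy 8x8 image):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2D convolution, enough to show the background effect."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.ones((3, 3)) / 9.0                 # simple averaging filter
black_bg = np.zeros((8, 8))
black_bg[3:5, 3:5] = 128.0                     # grey object on black background
white_bg = 255.0 - black_bg                    # same object, inverted

# Responses in a background-only corner: exactly 0 on black, ~255 on white.
b = conv2d_valid(black_bg, kernel)
w = conv2d_valid(white_bg, kernel)
```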

youkaichao commented 7 years ago

Now that the problem is solved, I get an accuracy of 85%. Not state-of-the-art, but reasonably good. Thank you! I'll go fine-tune now ^_^

ghost commented 7 years ago

@WeiTang114 Your suggestion worked and I achieved a test accuracy of 88%. But training always gets stuck at the constant 3.69 loss if I use the Caffe AlexNet model; any thoughts? @youkaichao Did you train from scratch or use the AlexNet model? If you used the pretrained model, did you make any changes? Thanks!

youkaichao commented 7 years ago

@priyam1994 I'm using the pretrained alexnet model, and it works. I have made no changes to the pretrained alexnet model.

ghost commented 7 years ago

@youkaichao Thank you for the clarification. Did you change any other parameters beyond what was originally suggested?

youkaichao commented 7 years ago

@priyam1994 Nope, I ran MVCNN with the default settings, and I don't know how to replace the AlexNet model... Is it possible that you forgot to run ./prepare_pretrained_alexnet.sh?

ghost commented 6 years ago

@WeiTang114 Could you tell me if there is a specific reason the rendered images (from the link you gave) are 600x600? Can I also feed in images of other dimensions, say 300x300 or 400x400?

Thank you

WeiTang114 commented 6 years ago

@priyam1994 That size is arbitrary. After being fed into the network the images are resized to 256x256 and cropped to 227x227, so I just wanted a size large enough that the images aren't distorted by the resize. Of course any other size is fine!
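
So the input size only needs to survive that pipeline. A rough NumPy sketch of the two steps (nearest-neighbour resize as a stand-in for whatever interpolation the repo's OpenCV call uses, then a center crop; helper names are made up):

```python
import numpy as np

def resize_nearest(im, size):
    """Nearest-neighbour resize to size x size (stand-in for cv2.resize)."""
    h, w = im.shape[:2]
    rows = (np.arange(size) * h // size).astype(int)
    cols = (np.arange(size) * w // size).astype(int)
    return im[rows][:, cols]

def center_crop(im, size):
    """Cut a size x size patch from the middle of the image."""
    h, w = im.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return im[top:top + size, left:left + size]

view = np.zeros((600, 600), dtype=np.uint8)   # any reasonably large render works
net_input = center_crop(resize_nearest(view, 256), 227)
print(net_input.shape)  # (227, 227)
```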

ghost commented 6 years ago

@WeiTang114 Thank you for your informative reply!

rlczddl commented 6 years ago

@weiweisunWHU Hi, can you tell me what changes you made? I set subtract_mean to true and removed the ReLU layer after fc8, but the predictions are always the same (not only 0, but other constant values too).

rlczddl commented 6 years ago

@youkaichao Hi, I want to confirm: you just added "cv2.bitwise_not(im)" after line 27 in input.py, with no other changes, and then it worked? Why doesn't it work for me?

491506870 commented 5 years ago

@youkaichao @WeiTang114 Hello, after adding your line "cv2.bitwise_not(im)" after line 27, my accuracy is still about 2%: for example, sometimes "acc=1.953125", sometimes "acc=2.734375". I haven't changed anything else in the code, and the dataset I train on is ModelNet40 with a white background, like your airplane image. Do you know what I should do? Thank you so much.

491506870 commented 5 years ago

@weiweisunWHU Your method really works! I tried it and reached more than 85% accuracy.

WChen1996 commented 5 years ago

@491506870 Hi, would you please specify the method, e.g. which code you changed? Thanks a lot!