Sypotu opened this issue 7 years ago
I also set batch_size to 8 because of my PC, and I ran into the same problem. Have you found the cause?
Set batch_size to 16 as well.
It's not because of the batch size: I have trained on another, better PC and the predictions are still all 0.
Maybe the reason is that fc8 uses a ReLU.
Yes, I removed the ReLU after fc8 and it's working fine now.
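For anyone else hitting this, here is a small NumPy sketch (not from this repo, the names are made up) of why a ReLU on fc8 can cause the collapse. The softmax cross-entropy gradient with respect to the logits is softmax(z) - onehot(y); with a ReLU after fc8 the chain rule multiplies it by the indicator z > 0, so once the true class's pre-activation goes negative it receives no gradient and can never recover.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

z = np.array([0.5, -1.0, -0.3, 0.2])   # pre-activation fc8 logits for 4 classes
y = 2                                   # true class, with a negative pre-activation

# Without ReLU: dL/dz = softmax(z) - onehot(y); the true class gets a negative
# gradient, so its logit is pushed up and the network can still learn.
grad_plain = softmax(z) - np.eye(len(z))[y]

# With ReLU after fc8: a = relu(z), dL/dz = (softmax(a) - onehot(y)) * (z > 0),
# so the gradient for any negative pre-activation is killed.
a = np.maximum(z, 0.0)
grad_relu = (softmax(a) - np.eye(len(z))[y]) * (z > 0)

print 'gradient without relu:', grad_plain   # true class entry is negative (useful)
print 'gradient with relu:   ', grad_relu    # true class entry is exactly 0 (dead)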
Have you made any other changes? I removed the ReLU layer after fc8 and set substract_mean to true because I use another dataset, but the predictions are always the same (not only 0, sometimes other values).
Are they the same from the first step, or do they slowly converge to the same value?
(In the first picture, the first list [122, 18, 69, 100, 38, ...] is the real labels of the validation data, and the second list [116, 116, 116, 116, ...] is the predictions on the validation data. The predictions on the test data look the same.) This is my training process. As you can see, the predictions on the validation data are all identical, but the repeated value changes from one validation run to the next.
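In case it helps to compare, this is roughly how I print the two lists (a sketch with made-up names, not the repo's code): take the argmax of the fc8 outputs for a validation batch, print it next to the ground-truth labels, and count how many distinct classes are predicted, which shows whether the collapse is there from the first validation or develops over time.

import numpy as np

def report_val_batch(labels, logits):
    # labels: (batch,) ground-truth class ids; logits: (batch, n_classes) fc8 outputs
    preds = np.argmax(logits, axis=1)
    print 'labels:     ', labels.tolist()        # e.g. [122, 18, 69, 100, 38, ...]
    print 'predictions:', preds.tolist()         # a collapsed run looks like [116, 116, ...]
    print 'distinct predicted classes:', len(np.unique(preds))
    print 'accuracy: %.3f' % np.mean(preds == labels)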
How did you change the network to remove the ReLU? I added this:
def _fc_norelu(name, in_, outsize, reuse=False):
    # Same as _fc, but returns the raw logits (no tf.nn.relu on the output).
    with tf.variable_scope(name, reuse=reuse) as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        insize = in_.get_shape().as_list()[-1]
        weights = _variable_with_weight_decay('weights', shape=[insize, outsize], wd=0.004)
        biases = _variable_on_cpu('biases', [outsize], tf.constant_initializer(0.0))
        fc = tf.matmul(in_, weights) + biases
        _activation_summary(fc)
        print name, fc.get_shape().as_list()
        return fc
And I replaced fc8 = _fc("fc8", fc7, n_classes) with fc8 = _fc_norelu('fc8', fc7, n_classes).
I just modified fc8 in the inference_multiview function of model.py; it should work the same as yours. I will try your version.
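For reference, the in-place change I mean looks roughly like this (a sketch, not the repo's exact code; only the fc8 part is rewritten without the ReLU, reusing the helpers shown above):

# inside inference_multiview() in model.py, replacing the original fc8 = _fc(...)
with tf.variable_scope('fc8') as scope:
    insize = fc7.get_shape().as_list()[-1]
    weights = _variable_with_weight_decay('weights', shape=[insize, n_classes], wd=0.004)
    biases = _variable_on_cpu('biases', [n_classes], tf.constant_initializer(0.0))
    fc8 = tf.matmul(fc7, weights) + biases   # raw logits, no tf.nn.relu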
Hi,
I am training the network initialized with _alexnetimagenet.npy, using the rendered views provided by @WeiTang114 (https://drive.google.com/open?id=0B4v2jR3WsindMUE3N2xiLVpyLW8). The only thing I have changed is reducing the batch size from 16 to 8, because my GPU doesn't have enough memory.
But quite quickly (after 300-400 steps) the network gets stuck classifying all the inputs as class 0 (the accuracy corresponds to random guessing). Can this be due to the reduction of the batch size, or is there another reason?
Thank you for your help!