Could you please remove the padding parameter or set pad=(0, 0), then try again? Let me know the result, thanks!
Hi, here are the results: with pad=(0, 0) the output is correct, but with pad=(1, 1) the output becomes NaN.
conv2_act1 = mx.sym.QActivation(data=batch1_3, act_bit=1, backward_only=True, name="conv2_act1")
conv2_1 = mx.sym.QConvolution(data=conv2_act1, pad=(0, 0), kernel=(3, 3), num_filter=128,
                              act_bit=1, weight_bit=1, cudnn_off=False, name="conv2_1")
relu2_1 = mx.symbol.Activation(data=conv2_1, act_type="relu", name="relu2_1")
batch2_1 = mx.sym.BatchNorm(data=relu2_1, name="batch2_1")
conv2_act2 = mx.sym.QActivation(data=batch2_1, act_bit=1, backward_only=True, name="conv2_act2")
conv2_2 = mx.sym.QConvolution(data=conv2_act2, pad=(1, 1), kernel=(3, 3), num_filter=128,
                              act_bit=1, weight_bit=1, cudnn_off=False, name="conv2_2")
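One possible workaround, sketched below but not verified against BMXNet internals: do the padding explicitly with mx.sym.pad and pass pad=(0, 0) to QConvolution, so the cuDNN binary kernel never handles padding itself. Whether constant zero is the right padding value for binarized activations is an assumption here.

# Sketch: explicit padding outside QConvolution (zero padding is an assumption).
# pad_width order for NCHW is (batch, batch, channel, channel, h, h, w, w).
conv2_act2_padded = mx.sym.pad(data=conv2_act2, mode="constant",
                               constant_value=0,
                               pad_width=(0, 0, 0, 0, 1, 1, 1, 1))
conv2_2 = mx.sym.QConvolution(data=conv2_act2_padded, pad=(0, 0), kernel=(3, 3),
                              num_filter=128, act_bit=1, weight_bit=1,
                              cudnn_off=False, name="conv2_2")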
Thanks for your report! I think this is a bug in the binary cuDNN conv layer. We will try to fix it as soon as possible! Until then you could use the normal binary conv layer (slower) or avoid padding.
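For the layer that triggers the NaNs, the slower fallback suggested above would look roughly like this (a sketch based on the code posted earlier in this thread):

# Fallback: disable the cuDNN binary kernel for the padded layer.
conv2_2 = mx.sym.QConvolution(data=conv2_act2, pad=(1, 1), kernel=(3, 3),
                              num_filter=128, act_bit=1, weight_bit=1,
                              cudnn_off=True,  # slower, but padding behaves correctly
                              name="conv2_2")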
Please check our new version, BMXNet v2: https://github.com/hpi-xnor/BMXNet-v2
Hi, when I use cudnn_off=False, the output is NaN when I watch training with mx.mon.Monitor (left-hand side of the figure below). If I use cudnn_off=True, the output values look reasonable (right-hand side of the figure). Could you please help me? Thanks a lot!
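In case it helps reproduce this, here is a minimal sketch of how mx.mon.Monitor can be attached to inspect layer outputs for NaNs; it assumes the standard MXNet Module API, and mod and train_iter are placeholders for the actual module and data iterator.

import mxnet as mx

# Sketch (assumption: a Module-based training loop; `mod` and `train_iter`
# stand in for the real module and data iterator).
mon = mx.mon.Monitor(interval=1, pattern='.*')  # record stats for every array, every batch
mod.install_monitor(mon)

for batch in train_iter:
    mon.tic()                    # start recording for this batch
    mod.forward_backward(batch)
    mod.update()
    mon.toc_print()              # print name/shape/stat; NaN outputs show up here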