awslabs / mxboard

Logging MXNet data for visualization in TensorBoard.
Apache License 2.0

Cannot get gradient array for Parameter 'hybridsequential2_batchnorm0_running_mean' because grad_req='null' #37

Open onlyNata opened 5 years ago

onlyNata commented 5 years ago

net = nn.HybridSequential()
with net.name_scope():
    net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
            nn.BatchNorm(),
            nn.Activation('relu'))
......
grads = [i.grad() for i in net.collect_params().values()]
assert len(grads) == len(param_names)
for i, name in enumerate(param_names):
    sw.add_histogram(tag=name, values=grads[i], global_step=epoch, bins=1000)

File "F:\Anaconda3\envs\gluon\lib\site-packages\mxnet\gluon\parameter.py", line 522, in grad
    "because grad_req='null'"%(self.name))

RuntimeError: Cannot get gradient array for Parameter 'hybridsequential2_batchnorm0_running_mean' because grad_req='null'

szha commented 5 years ago

Try replacing

grads = [i.grad() for i in net.collect_params().values()]

with

grads = [i.grad() for i in net.collect_params().values() if i.grad_req != 'null']
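One caveat with the filter above: the original snippet also asserts `len(grads) == len(param_names)`, so the name list needs the same filter or the assertion will fail. The following is a minimal sketch, assuming `param_names` was meant to match the keys of `net.collect_params()` (the original post elides how it is built). The helper `named_grads` and the `_StubParam` class are hypothetical; the stub mimics only `grad_req` and `grad()` so that the snippet runs without MXNet installed.

```python
def named_grads(params):
    """Return (name, gradient) pairs for parameters that have a gradient.

    `params` is any mapping of name -> object exposing `grad_req` and
    `grad()`, for example the result of `net.collect_params()` in Gluon.
    """
    return [(name, p.grad()) for name, p in params.items()
            if p.grad_req != 'null']


class _StubParam:
    """Hypothetical stand-in for a Gluon Parameter, for illustration only."""

    def __init__(self, name, grad_req, grad=None):
        self.name = name
        self.grad_req = grad_req
        self._grad = grad

    def grad(self):
        # Mirrors the error reported in this issue.
        if self.grad_req == 'null':
            raise RuntimeError(
                "Cannot get gradient array for Parameter '%s' "
                "because grad_req='null'" % self.name)
        return self._grad


params = {
    'conv0_weight': _StubParam('conv0_weight', 'write', [0.1, -0.2]),
    'batchnorm0_gamma': _StubParam('batchnorm0_gamma', 'write', [0.3]),
    'batchnorm0_running_mean': _StubParam('batchnorm0_running_mean', 'null'),
}

# Names and gradients are filtered together, so they cannot get out of step.
pairs = named_grads(params)
logged_names = [name for name, _ in pairs]

# With a real network and an mxboard SummaryWriter `sw`, the loop would be:
# for name, grad in named_grads(net.collect_params()):
#     sw.add_histogram(tag=name, values=grad, global_step=epoch, bins=1000)
```

Building the pairs in one pass avoids indexing `grads[i]` against a separately maintained name list.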

BebDong commented 3 years ago

@onlyNata I have the same problem. Have you solved it?

tcfkaj commented 3 years ago

I can confirm that @szha's solution fixed this error. However, I am now encountering a different error that may or may not be related. I don't want to hijack this thread, so I will open another issue and link to it from here if I cannot troubleshoot it quickly.

BebDong commented 3 years ago

Thanks, @tcfkaj. But @szha's solution only returns the gradients of some of the parameters, not all of them. I am confused that the gradients are still unavailable even after explicitly calling model.collect_params().setattr('grad_req', 'write').

tcfkaj commented 3 years ago

@BebDong In the case of BatchNorm, it makes sense that the _running_mean and _running_var would not be writeable and thus not trainable because their job is just to keep track of batch-level or global-level statistics. You can see in the source that they are always initialized with grad_req='null'. It appears that you cannot change grad_req from 'null' to 'write' directly for any of the parameters of BatchNorm. I am not sure how exactly this is enforced, but it makes sense for certain parameters.
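On how this is enforced: as far as I can tell from `gluon/parameter.py` in MXNet 1.x, `Parameter` takes a `differentiable` flag, and the `grad_req` setter silently coerces any requested value to `'null'` when that flag is `False`. `BatchNorm` appears to create `running_mean` and `running_var` with `differentiable=False`, which would explain why `setattr('grad_req', 'write')` has no effect on them. Treat this reading as an assumption and check it against your installed version. The toy class below (`ToyParam` is hypothetical, not MXNet code) only models that assumed behavior.

```python
class ToyParam:
    """Hypothetical model of the assumed grad_req setter behavior."""

    def __init__(self, name, grad_req='write', differentiable=True):
        self.name = name
        self._differentiable = differentiable
        self._grad_req = None
        self.grad_req = grad_req  # routed through the setter below

    @property
    def grad_req(self):
        return self._grad_req

    @grad_req.setter
    def grad_req(self, req):
        assert req in ('write', 'add', 'null')
        # Assumed rule: non-differentiable parameters always stay 'null'.
        if not self._differentiable:
            req = 'null'
        self._grad_req = req


gamma = ToyParam('batchnorm0_gamma')
running_mean = ToyParam('batchnorm0_running_mean',
                        grad_req='null', differentiable=False)

# Equivalent in spirit to collect_params().setattr('grad_req', 'write').
for p in (gamma, running_mean):
    p.grad_req = 'write'

result = {p.name: p.grad_req for p in (gamma, running_mean)}
```

If the goal is to visualize the running statistics themselves, their values (rather than gradients) can still be read with `Parameter.data()` and passed to `add_histogram`.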

BebDong commented 3 years ago

@tcfkaj Thanks! It helps a lot.