Closed: miaott1234 closed this issue 1 year ago.
The legacy implementation of ProxylessNAS is known to be prone to gradient problems like this. Fortunately, a large portion of these problems have been fixed in the latest implementation, but unfortunately that version hasn't been released yet. Please hang tight for a while...
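For context, one common way such None gradients can arise with single-path sampling is that only the sampled candidate participates in the forward pass, so the parameters of the other candidates never receive gradients. Below is a minimal, generic PyTorch sketch of that effect; it is illustrative only and is not NNI's actual LayerChoice code.

```python
import torch
import torch.nn as nn

# Two candidate ops, analogous to the candidates inside a LayerChoice.
candidates = nn.ModuleList([nn.Linear(8, 8), nn.Linear(8, 8)])

x = torch.randn(4, 8)
# Single-path sampling: only candidate 0 takes part in the forward pass.
out = candidates[0](x)
out.sum().backward()

for name, p in candidates.named_parameters():
    # Candidate 0's parameters get gradients; candidate 1's stay None,
    # even though requires_grad is True for every parameter.
    print(name, p.requires_grad, p.grad is None)
```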
As far as I know, the official implementation of ProxylessNAS does not have gradient problems.
The official implementation of ProxylessNAS is also known to suffer from other problems. For example,
Thank you, I see. I am very much looking forward to the release of the latest version. Is it possible to get it before April?
I'm not sure whether we can get a stable release before April.
If you are interested in a preview, you can find it here (a working ProxylessNAS example): https://github.com/ultmaster/nni/blob/nas-nn-refactor/examples/nas/hub/proxyless_search.py
That's great. Thank you very much!
Describe the issue:
The ProxylessNAS example can't train because the gradients are None:

-->name: module.blocks.1.mobile_inverted_conv.ops.0.inverted_bottleneck.conv.weight -->grad_requirs: True --weight tensor(0.0037, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.inverted_bottleneck.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.inverted_bottleneck.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.depth_conv.conv.weight -->grad_requirs: True --weight tensor(-0.0017, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.depth_conv.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.depth_conv.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.point_linear.conv.weight -->grad_requirs: True --weight tensor(-0.0035, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.point_linear.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.0.point_linear.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.inverted_bottleneck.conv.weight -->grad_requirs: True --weight tensor(0.0092, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.inverted_bottleneck.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.inverted_bottleneck.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.depth_conv.conv.weight -->grad_requirs: True --weight tensor(-0.0001, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.depth_conv.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.depth_conv.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.point_linear.conv.weight -->grad_requirs: True --weight tensor(0.0023, device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.point_linear.bn.weight -->grad_requirs: True --weight tensor(1., device='cuda:0') -->grad_value: None
-->name: module.blocks.1.mobile_inverted_conv.ops.1.point_linear.bn.bias -->grad_requirs: True --weight tensor(0., device='cuda:0') -->grad_value: None
The layers built by LayerChoice have None grad_value, but ordinary layers such as the first layer do have gradients.
Please check it. Thanks.
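A log like the one above can be produced by looping over model.named_parameters() after backward(); the following is a minimal sketch of such a check, an approximation and not necessarily the exact script used here.

```python
import torch

def dump_grads(model: torch.nn.Module) -> None:
    # After loss.backward(), print each parameter's name, its requires_grad
    # flag, the mean of its weights, and its gradient (None if never populated).
    for name, param in model.named_parameters():
        print(f"-->name: {name} -->grad_requirs: {param.requires_grad} "
              f"--weight {param.data.mean()} -->grad_value: {param.grad}")

# Typical usage after one training step (model, criterion, inputs and targets
# are placeholders here):
#   loss = criterion(model(inputs), targets)
#   loss.backward()
#   dump_grads(model)
```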
Environment: