Closed astonzhang closed 4 years ago
http://preview.d2l.ai.s3-website-us-west-2.amazonaws.com/d2l-en/master/chapter_deep-learning-computation/parameters.html
mxnet:
print(net[0].collect_params())
print(net.collect_params())

dense0_ (
  Parameter dense0_weight (shape=(8, 4), dtype=float32)
  Parameter dense0_bias (shape=(8,), dtype=float32)
)
sequential0_ (
  Parameter dense0_weight (shape=(8, 4), dtype=float32)
  Parameter dense0_bias (shape=(8,), dtype=float32)
  Parameter dense1_weight (shape=(1, 8), dtype=float32)
  Parameter dense1_bias (shape=(1,), dtype=float32)
)
pytorch:
print(net[1].state_dict())
print(net.state_dict())

OrderedDict()
OrderedDict([('0.weight', tensor([[ 0.2233,  0.1815, -0.1880,  0.1780],
        [ 0.1493, -0.4033, -0.3357, -0.1170],
        [-0.4171, -0.2477, -0.4834, -0.2077],
        [-0.4015,  0.2357,  0.1285,  0.4564],
        [-0.4385, -0.2682, -0.0510, -0.2132],
        [-0.1044,  0.4734,  0.1390,  0.2341],
        [-0.2781,  0.2203,  0.4285, -0.4425],
        [ 0.1697,  0.0497,  0.0042, -0.2616]])),
 ('0.bias', tensor([ 0.3599, -0.4421, -0.1519,  0.1739, -0.2889,  0.1194,  0.4794,  0.4822])),
 ('2.weight', tensor([[ 0.1173,  0.3268,  0.3000, -0.2517, -0.2242,  0.0704, -0.1405, -0.3193]])),
 ('2.bias', tensor([-0.1843]))])
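For instance, net[1].state_dict() returns an empty OrderedDict(), which is inconsistent with the mxnet output, where the first print shows the weight and bias of a dense layer. The state dict keys are only 0.* and 2.*, so index 1 in the pytorch net appears to be a layer without parameters.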
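One possible way to make the two outputs line up is to print the first layer (index 0) and list name, shape, and dtype instead of raw values. This is only a sketch: the architecture `nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))` is an assumption inferred from the shapes and keys above, and `param_summary` is a hypothetical helper, not part of PyTorch or of the book.

```python
import torch
from torch import nn

# Assumed architecture, inferred from the state_dict keys and shapes above.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

def param_summary(module):
    """Hypothetical helper: (name, shape, dtype) for each parameter of a module."""
    return [(name, tuple(p.shape), p.dtype)
            for name, p in module.named_parameters()]

# Index 0 is a Linear layer, so it has a weight and a bias,
# which matches what net[0].collect_params() shows in MXNet.
for name, shape, dtype in param_summary(net[0]):
    print(f"Parameter {name} (shape={shape}, dtype={dtype})")

# Index 1 is the ReLU in this assumed net; it holds no parameters.
print(param_summary(net[1]))

# The whole network, with names qualified by layer index.
for name, shape, dtype in param_summary(net):
    print(f"Parameter {name} (shape={shape}, dtype={dtype})")
```

Using `net[0]` rather than `net[1]` in the pytorch tab would already avoid the empty `OrderedDict()`; the summary helper is optional.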
mxnet:
class MyInit(init.Initializer):
    def _init_weight(self, name, data):
        print('Init', name, data.shape)
        data[:] = np.random.uniform(-10, 10, data.shape)
        data *= np.abs(data) >= 5

net.initialize(MyInit(), force_reinit=True)
net[0].weight.data()[0:2]

Init dense0_weight (8, 4)
Init dense1_weight (1, 8)
array([[ 0.       , -0.       , -0.       ,  8.522827 ],
       [ 0.       , -8.828651 , -0.       , -5.6012006]])
pt:
def my_init(m):
    if type(m) == nn.Linear:
        nn.init.uniform_(m.weight, -10, 10)
        m.weight.data *= m.weight.data.abs() >= 5

net.apply(my_init)
net[0].weight[0:2]

tensor([[ 7.4014, -8.7963,  0.0000, -6.2305],
        [-6.9865,  0.0000, -0.0000, -0.0000]], grad_fn=<SliceBackward>)
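Can we also do print('Init', name, data.shape) in the pt version, so that it prints the name and shape of each initialized parameter like the mxnet version does?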
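A rough sketch of how this could look in PyTorch. `net.apply` passes only the module and not its name, so this version walks `net.named_modules()` instead to get a qualified name. The architecture is assumed (as in the d2l section), and `my_init(name, m)` and the `initialized` list are hypothetical names for illustration.

```python
import torch
from torch import nn

# Assumed architecture; not stated in this issue.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

initialized = []  # names of the weights that were re-initialized

def my_init(name, m):
    """Hypothetical variant of my_init that also receives the module name."""
    if isinstance(m, nn.Linear):
        print('Init', f'{name}.weight', tuple(m.weight.shape))
        with torch.no_grad():
            # Draw from U(-10, 10), then zero out entries with |w| < 5.
            m.weight.uniform_(-10, 10)
            m.weight.mul_(m.weight.abs() >= 5)
        initialized.append(f'{name}.weight')

# named_modules() yields (qualified_name, module) pairs, including the
# container itself under the empty name, which the isinstance check skips.
for name, m in net.named_modules():
    my_init(name, m)

print(net[0].weight[0:2])
```

With the assumed net this prints `Init 0.weight (8, 4)` and `Init 2.weight (1, 8)`. The names follow PyTorch's index-based naming rather than MXNet's `dense0_weight` style.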
@astonzhang Consistency fix in #1235