Open astonzhang opened 4 years ago
http://preview.d2l.ai.s3-website-us-west-2.amazonaws.com/d2l-en/master/chapter_deep-learning-computation/parameters.html
1.
mxnet:
print(net[0].collect_params())
print(net.collect_params())

dense0_ (
  Parameter dense0_weight (shape=(8, 4), dtype=float32)
  Parameter dense0_bias (shape=(8,), dtype=float32)
)
sequential0_ (
  Parameter dense0_weight (shape=(8, 4), dtype=float32)
  Parameter dense0_bias (shape=(8,), dtype=float32)
  Parameter dense1_weight (shape=(1, 8), dtype=float32)
  Parameter dense1_bias (shape=(1,), dtype=float32)
)
tf:
print(net.layers[1].weights)
print(net.get_weights())

[<tf.Variable 'sequential/dense/kernel:0' shape=(4, 4) dtype=float32, numpy=
array([[ 0.5524189 ,  0.23129171,  0.0363729 , -0.8600636 ],
       [-0.69835407, -0.06596345,  0.01897395, -0.5417439 ],
       [ 0.54055935,  0.6689728 , -0.8319559 , -0.09743792],
       [-0.1610511 ,  0.49009317, -0.61211747, -0.45042837]], dtype=float32)>,
 <tf.Variable 'sequential/dense/bias:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>]
[array([[ 0.5524189 ,  0.23129171,  0.0363729 , -0.8600636 ],
        [-0.69835407, -0.06596345,  0.01897395, -0.5417439 ],
        [ 0.54055935,  0.6689728 , -0.8319559 , -0.09743792],
        [-0.1610511 ,  0.49009317, -0.61211747, -0.45042837]], dtype=float32),
 array([0., 0., 0., 0.], dtype=float32),
 array([[ 0.0090847 ],
        [-1.0178163 ],
        [-0.29936522],
        [-0.6696218 ]], dtype=float32),
 array([0.], dtype=float32)]
For instance, bias is missing.
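One way to bring the tf snippet closer to the mxnet collect_params() listing (which shows every parameter, including the biases, as names and shapes rather than full values) would be to iterate over the variables and print only name/shape. A minimal sketch, assuming a Flatten -> Dense(4) -> Dense(1) model inferred from the (4, 4) kernel shape above, not the book's exact code:

import tensorflow as tf

# Assumed reconstruction of the section's model: 4 -> 4 -> 1 MLP.
net = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation=tf.nn.relu),
    tf.keras.layers.Dense(1),
])
net(tf.random.uniform((2, 4)))  # run once so the variables are created

# Analogue of print(net[0].collect_params()): first Dense layer only.
for v in net.layers[1].weights:
    print(v.name, v.shape, v.dtype.name)

# Analogue of print(net.collect_params()): every parameter in the network.
for v in net.weights:
    print(v.name, v.shape, v.dtype.name)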
2.
mxnet:
net.collect_params()['dense1_bias'].data()

array([0.])
tf:
net.get_weights()[1]

array([0., 0., 0., 0.], dtype=float32)
The tf output does not look like the corresponding bias: mxnet's dense1_bias has a single element, while net.get_weights()[1] is the hidden layer's bias with four elements.
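If the intent is to show the output layer's bias, as the mxnet dense1_bias line does, the tf snippet needs a different index. A minimal sketch, assuming the same Flatten -> Dense(4) -> Dense(1) model as above, where get_weights() returns the variables in layer order [kernel0, bias0, kernel1, bias1]:

# Output layer's bias, matching net.collect_params()['dense1_bias'].data():
print(net.get_weights()[3])       # -> array([0.], dtype=float32)
print(net.layers[2].weights[1])   # same variable, accessed through the layer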
3.
mxnet:
class MyInit(init.Initializer):
    def _init_weight(self, name, data):
        print('Init', name, data.shape)
        data[:] = np.random.uniform(-10, 10, data.shape)
        data *= np.abs(data) >= 5

net.initialize(MyInit(), force_reinit=True)
net[0].weight.data()[0:2]

Init dense0_weight (8, 4)
Init dense1_weight (1, 8)
array([[ 0.       , -0.       , -0.       ,  8.522827 ],
       [ 0.       , -8.828651 , -0.       , -5.6012006]])
tf:
print(net.layers[1].weights[0])

<tf.Variable 'sequential_6/dense_13/kernel:0' shape=(4, 4) dtype=float32, numpy=
array([[0.02371812, 0.67190015, 0.40087283, 0.56996346],
       [0.42595625, 0.5223805 , 0.34758675, 0.5847038 ],
       [0.22081661, 0.97955835, 0.9585841 , 0.5245316 ],
       [0.59826577, 0.59225726, 0.25385475, 0.30986   ]], dtype=float32)>
a) The print('Init', name, data.shape) output has no counterpart in the tf version.
b) The tf snippet should slice the first two rows ([0:2]) of the weight, as mxnet's net[0].weight.data()[0:2] does.
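To mirror both points, the tf initializer could print the shape it receives, and the inspection line could slice the first two rows. A minimal sketch of such an initializer (the MyInit name and print format just mirror the mxnet snippet; this is not the book's current tf code, and only the shape is available inside a Keras initializer, not a parameter name):

import tensorflow as tf

class MyInit(tf.keras.initializers.Initializer):
    def __call__(self, shape, dtype=None):
        if dtype is None:
            dtype = tf.float32
        print('Init', shape)  # shape only; the parameter name is not passed in
        data = tf.random.uniform(shape, -10, 10, dtype=dtype)
        # Keep only entries with magnitude >= 5, zero out the rest.
        return data * tf.cast(tf.abs(data) >= 5, dtype)

net = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation=tf.nn.relu, kernel_initializer=MyInit()),
    tf.keras.layers.Dense(1, kernel_initializer=MyInit()),
])
net(tf.random.uniform((2, 4)))

# Mirror net[0].weight.data()[0:2]: print only the first two rows.
print(net.layers[1].weights[0][0:2])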
4.
mxnet:
net[0].weight.data()[0]

array([42.      ,  1.      ,  1.      ,  9.522827])
tf:
net.layers[1].weights[0]

<tf.Variable 'sequential_6/dense_13/kernel:0' shape=(4, 4) dtype=float32, numpy=
array([[42.       ,  1.6719002,  1.4008728,  1.5699635],
       [ 1.4259562,  1.5223805,  1.3475868,  1.5847038],
       [ 1.2208166,  1.9795583,  1.9585841,  1.5245316],
       [ 1.5982658,  1.5922573,  1.2538548,  1.30986  ]], dtype=float32)>
The outputs have different shapes: mxnet prints a single row of the weight (net[0].weight.data()[0]), while tf prints the entire kernel.
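If the goal is for the two outputs to have the same shape, the tf snippet could print just the first row of the kernel after the in-place updates. A minimal sketch; the assign lines are an assumption about how the values above were produced (42 written at [0, 0], all entries incremented by 1), not the book's exact code:

# After something like:
#   net.layers[1].weights[0][:].assign(net.layers[1].weights[0] + 1)
#   net.layers[1].weights[0][0, 0].assign(42)
# print only the first row, matching mxnet's net[0].weight.data()[0]:
print(net.layers[1].weights[0][0])   # -> [42. , 1.67..., 1.40..., 1.57...]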
Reference fix for PyTorch https://github.com/d2l-ai/d2l-en/pull/1235. I'll take a look at the TF version later.