amasky / ram

Recurrent Models of Visual Attention (RAM) with Chainer
MIT License

why np.identity? #2

Open machanic opened 7 years ago

machanic commented 7 years ago

In the train.py file:

if not args.lstm:
    data = model.core_hh.W.data
    data[:] = np.identity(data.shape[0], dtype=np.float32)

amasky commented 7 years ago

Actually, this is not how the original paper initializes the network. According to arXiv:1504.00941, initializing an RNN's recurrent weight matrix with the identity matrix may improve performance, but I don't think it makes a difference in this case. I will delete this code later.
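
For readers wondering what the identity initialization does in practice, here is a minimal NumPy-only sketch. It is not the code in train.py: `init_identity`, `W_hh`, and the ReLU step are hypothetical names and assumptions made for illustration, with `W_hh` standing in for `model.core_hh.W.data`. It shows that, with an identity recurrent matrix and no input, a non-negative hidden state is carried over unchanged by a ReLU recurrent step.

```python
import numpy as np


def init_identity(W):
    """Overwrite a square weight matrix in place with the identity.

    Hypothetical helper: mirrors the `data[:] = np.identity(...)` line
    quoted above, but is not part of the repository.
    """
    if W.ndim != 2 or W.shape[0] != W.shape[1]:
        raise ValueError("expected a square 2-D matrix")
    W[:] = np.identity(W.shape[0], dtype=W.dtype)
    return W


# Assumed stand-in for model.core_hh.W.data: a randomly initialized
# hidden-to-hidden weight matrix of size 4 x 4.
rng = np.random.default_rng(0)
W_hh = rng.standard_normal((4, 4)).astype(np.float32)
init_identity(W_hh)

# One recurrent step with no input and a ReLU nonlinearity (an assumption;
# the activation used by the repository's core network may differ).
h = np.array([0.5, 1.0, 0.0, 2.0], dtype=np.float32)
h_next = np.maximum(W_hh @ h, 0.0)
```

Because the assignment uses `W[:] = ...`, the existing array is modified in place, so any object holding a reference to it (such as a link's parameter) sees the new values.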