gloddream opened this issue 7 years ago
Hi, the paper says: "The derivative of $E_l$ with respect to the activations in layer $l$ can be computed analytically:"
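For reference, the quoted equation (Eq. 6 of Gatys et al., *A Neural Algorithm of Artistic Style*; it appeared as an image in the original post) is:

$$\frac{\partial E_l}{\partial F^l_{ij}} = \begin{cases} \dfrac{1}{N_l^2 M_l^2}\left((F^l)^\top (G^l - A^l)\right)_{ji} & \text{if } F^l_{ij} > 0 \\ 0 & \text{if } F^l_{ij} < 0 \end{cases}$$

where $F^l$ are the layer activations, $G^l$ and $A^l$ are the Gram matrices of the generated and style images, $N_l$ is the number of feature maps, and $M_l$ their spatial size.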
and this is the code from the `update` method in your repo:
```python
if self.style_weights[l] > 0:
    # Difference between the current and target Gram matrices, (G^l - A^l)
    diff = gram_matrix(x_feats[l]) - self.style_grams[l]
    n_channels = diff.shape[0]
    # Flatten spatial dimensions: F^l with shape (N_l, M_l)
    x_feat = ca.reshape(x_feats[l], (n_channels, -1))
    # Analytic style gradient, (G^l - A^l) F^l, reshaped back to the feature map
    style_grad = ca.reshape(ca.dot(diff, x_feat), x_feats[l].shape)
    # Normalize by the gradient's L1 norm and apply the layer's style weight
    norm = ca.sum(ca.fabs(style_grad))
    weight = float(self.style_weights[l]) / norm
    style_grad *= weight
    grad += style_grad
    loss += 0.25*weight*ca.sum(diff**2)
grad = layer.bprop(grad)
```
My question is: where is the part of the equation marked in red, i.e. the $\frac{1}{N_l^2 M_l^2}$ scaling factor, implemented? I cannot find the corresponding code. Thanks.
Hi, I believe the scaling term is incorporated when I precompute the Gram matrix.
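A minimal sketch of what folding that scaling into the precomputed target Gram matrix could look like (the helper names and the exact constant here are assumptions for illustration, not the repo's actual code; it assumes cudarray's numpy-like `ca.reshape`/`ca.dot`/`ca.transpose`, consistent with the calls in the snippet above):

```python
import numpy as np
import cudarray as ca


def gram_matrix(feats_c01):
    # Gram matrix of a (channels, height, width) feature map:
    # flatten the spatial dimensions and compute F F^T.
    n_channels = feats_c01.shape[0]
    feats = ca.reshape(feats_c01, (n_channels, -1))
    return ca.dot(feats, ca.transpose(feats))


def precomputed_style_gram(style_feats):
    # Hypothetical precomputation step: divide the target Gram matrix A^l
    # by N_l^2 * M_l^2 once, so the update loop needs no explicit scaling.
    n_channels = style_feats.shape[0]                  # N_l
    n_positions = int(np.prod(style_feats.shape[1:]))  # M_l
    return gram_matrix(style_feats) / float(n_channels**2 * n_positions**2)
```

Note also that the update loop already rescales the style gradient by `weight = style_weights[l] / norm`, where `norm` is the gradient's own L1 norm; any constant prefactor on `style_grad` cancels there, so it would affect only the reported loss value, not the optimization direction.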