Open duyvuleo opened 6 years ago
My bad. layer_norm can work on matrix input. Sorry for the misunderstanding!
It seems I actually confused the behaviour of layer_norm again. The current implementation of layer_norm does not perform position-wise layer normalisation: given a matrix input, I want each column to be normalised independently (col-wise layer normalisation).
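To make the intended semantics concrete, here is a minimal framework-free sketch (plain C++, not DyNet's API; the function name and row-major layout are my own assumptions) of col-wise normalisation: each column is centred and scaled with its own mean and standard deviation. The gain/bias terms of full layer norm are omitted for clarity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch only: normalise each column of a rows-by-cols matrix (row-major)
// with that column's own mean and standard deviation.
std::vector<double> layer_norm_colwise_ref(const std::vector<double>& x,
                                           std::size_t rows, std::size_t cols,
                                           double epsilon = 1e-8) {
  std::vector<double> y(x.size());
  for (std::size_t j = 0; j < cols; ++j) {
    double mu = 0.0;
    for (std::size_t i = 0; i < rows; ++i) mu += x[i * cols + j];
    mu /= rows;
    double var = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
      const double d = x[i * cols + j] - mu;
      var += d * d;
    }
    const double sigma = std::sqrt(var / rows);
    for (std::size_t i = 0; i < rows; ++i)
      y[i * cols + j] = (x[i * cols + j] - mu) / (sigma + epsilon);
  }
  return y;
}
```

After this, every column of the result has (approximately) zero mean and unit standard deviation, independent of the other columns.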
Yes, this would be nice to have but is not implemented yet. Layer norm is currently implemented as a combination of basic DyNet operations (i.e. it is not its own specially optimized operation), so you could look at the implementation in expr.cc
and create your own dimension-wise version. If you have trouble doing so with the operations currently implemented in DyNet, we can add the ones that are necessary.
Currently, I do it using the existing nodes DyNet has:
dynet::Expression layer_norm_colwise(const dynet::Expression& x,
                                     const dynet::Expression& g,
                                     const dynet::Expression& b,
                                     float epsilon = 1e-8) {
  // Broadcast the per-column mean and std back to the shape of x by stacking
  // x.dim()[0] copies of the (transposed) reduced row vectors.
  dynet::Expression mu = dynet::concatenate(std::vector<dynet::Expression>(
      x.dim()[0], dynet::transpose(dynet::mean_dim(x, {0}))));
  dynet::Expression sigma = dynet::concatenate(std::vector<dynet::Expression>(
      x.dim()[0], dynet::transpose(dynet::std_dim(x, {0}))));
  dynet::Expression x_centered = x - mu;
  // epsilon is added to the std (not the variance) for numerical stability.
  return dynet::cmult(g, dynet::cdiv(x_centered, sigma + epsilon)) + b;
}
It isn't pretty, and it seems to be slow in practice.
I am looking at this native impl.: https://github.com/marian-nmt/marian-dev/blob/7432024c7de7c2b928b1654d62afb7b9834ed934/src/kernels/tensor_operators.cu.
Do you think we should have the same?
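For comparison, what a fused native op buys over the concatenate/transpose workaround above can be sketched as follows (plain C++, not DyNet's API or the actual marian kernel; the function name, the per-row gain/bias vectors g and b, and the sum/sum-of-squares accumulation are my own assumptions): the statistics are accumulated in a single pass per column, and no full-size mu/sigma matrices are ever materialised.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a fused col-wise layer norm: one pass per column
// accumulates the sum and sum of squares, then a second pass writes the
// normalised output with gain g and bias b (each of length rows).
std::vector<float> fused_layer_norm_colwise(const std::vector<float>& x,
                                            const std::vector<float>& g,
                                            const std::vector<float>& b,
                                            std::size_t rows, std::size_t cols,
                                            float epsilon = 1e-8f) {
  std::vector<float> y(x.size());
  for (std::size_t j = 0; j < cols; ++j) {
    float sum = 0.f, sq = 0.f;
    for (std::size_t i = 0; i < rows; ++i) {  // single statistics pass
      const float v = x[i * cols + j];
      sum += v;
      sq += v * v;
    }
    const float mu = sum / rows;
    const float sigma = std::sqrt(sq / rows - mu * mu);
    for (std::size_t i = 0; i < rows; ++i)
      y[i * cols + j] =
          g[i] * (x[i * cols + j] - mu) / (sigma + epsilon) + b[i];
  }
  return y;
}
```

In a real CUDA kernel the per-column reductions would be parallelised across threads, but the arithmetic is the same.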
Hi all,
Is it useful to have a layer_norm_2d (taking a matrix input) in addition to layer_norm? I have tried a naive version, but it may be slow.
Any suggestion? I would be happy to work on this!
Thanks!