dmlc / minpy

NumPy interface with mixed backend execution
https://minpy.readthedocs.io/en/latest/

Wrong implementation of the xavier initializer #158

Open wddabc opened 7 years ago

wddabc commented 7 years ago

The Xavier initializer at https://github.com/dmlc/minpy/blob/master/minpy/nn/init.py#L33 is wrong. According to Glorot and Bengio (2010), Eq. (1) and Eq. (16), the weights are sampled from a uniform distribution, not a normal distribution (I assume npr.randn samples from a normal distribution, as numpy.random.randn does).

I tried changing L33 to:

ret = npr.uniform(low=-var, high=var, size=shape)

but that raised a different exception in MXNet, which may be a separate issue.
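For reference, here is a minimal sketch of the uniform sampling described by Eq. (16) of the paper, written against plain NumPy rather than minpy. The function name `xavier_uniform`, the `rng` argument, and the assumption that `shape` is a 2-D `(fan_in, fan_out)` pair are illustrative choices, not part of minpy's API. How `var` is computed in init.py is not shown in this thread, so the bound is computed directly here.

```python
import numpy as np

def xavier_uniform(shape, rng=None):
    """Sample W ~ U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)).

    Assumes a 2-D weight of shape (fan_in, fan_out); this layout is an
    assumption of the sketch, not something taken from minpy.
    """
    rng = np.random.default_rng() if rng is None else rng
    fan_in, fan_out = shape
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(low=-limit, high=limit, size=shape)

# Example: a 256 x 128 weight matrix with a fixed seed for reproducibility.
w = xavier_uniform((256, 128), rng=np.random.default_rng(0))
```

With this bound the variance of each weight is limit^2 / 3 = 2 / (fan_in + fan_out), which is the target variance the paper derives.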

Update: Even with the above fixed, the initializer is still not right, because the appropriate Xavier scale depends on the activation function. See https://github.com/clab/dynet/pull/295 and https://github.com/clab/dynet/issues/348 for discussion.
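One common way to handle the activation dependence is to multiply the uniform bound by a per-activation gain. The sketch below is only an illustration: the names `GAINS`, `xavier_limit`, and `xavier_uniform_with_gain` are hypothetical, and the gain values are the ones PyTorch's `torch.nn.init.calculate_gain` uses. They are an assumption here and are not taken from the linked DyNet discussions, which may settle on different values.

```python
import numpy as np

# Assumed gains (same values as torch.nn.init.calculate_gain); not from this thread.
GAINS = {
    "linear": 1.0,
    "sigmoid": 1.0,
    "tanh": 5.0 / 3.0,
    "relu": float(np.sqrt(2.0)),
}

def xavier_limit(fan_in, fan_out, activation="linear"):
    """Uniform bound: gain * sqrt(6 / (fan_in + fan_out))."""
    return GAINS[activation] * float(np.sqrt(6.0 / (fan_in + fan_out)))

def xavier_uniform_with_gain(shape, activation="linear", rng=None):
    """Sample a 2-D (fan_in, fan_out) weight from U[-limit, limit]."""
    rng = np.random.default_rng() if rng is None else rng
    fan_in, fan_out = shape
    limit = xavier_limit(fan_in, fan_out, activation)
    return rng.uniform(low=-limit, high=limit, size=shape)

# Same seed, different activation: only the scale of the samples changes.
w_linear = xavier_uniform_with_gain((64, 32), "linear", np.random.default_rng(0))
w_relu = xavier_uniform_with_gain((64, 32), "relu", np.random.default_rng(0))
```

Keeping the gain as a lookup separate from the sampler means a different convention can be swapped in without touching the sampling code.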