Open glcLucky opened 3 years ago
I think this formula may be wrong. Should we change it to np.sqrt(epsilon / embedding_dim)?
Why do you say so?