Hi Zhakshylyk,
I think we just assumed a normal distribution. It sounds interesting to consider other distributions. Could you please explain a bit why making the distribution of W_i non-continuous at 0 would potentially help the training?
Thanks! Zhouxing
Hi,
Have you tried other distributions for IBP init? Does it make a difference? Since the IBP training has a non-continuity at W_i = 0, would it make sense to initialize it with a Laplace distribution, W_i ~ Laplace(0, b)? Then |W_i| ~ Exponential(b^{-1}), so b = 2/n_i would also work.

Best regards, Zhakshylyk
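
For anyone who wants to try this, here is a rough sketch of the Laplace variant in plain Python. It assumes the goal is E|W_i| = 2/n_i, which is what the choice b = 2/n_i implies, since the mean of Exponential(b^{-1}) is b. The function names (`laplace_init`, `gaussian_init`, `mean_abs`) are made up for illustration and are not from the repository. The Gaussian baseline uses sigma = sqrt(2*pi)/n_i, which is simply the sigma that gives the same E|W_i| for a zero-mean normal; whether that matches the repository's actual init is an assumption.

```python
import math
import random


def laplace_init(n_in, n_out, rng):
    """Sample an n_out x n_in matrix with W ~ Laplace(0, b), b = 2 / n_in."""
    b = 2.0 / n_in
    # |W| ~ Exponential(rate=1/b), which has mean b; a random sign
    # turns the exponential sample into a Laplace(0, b) sample.
    return [
        [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / b) for _ in range(n_in)]
        for _ in range(n_out)
    ]


def gaussian_init(n_in, n_out, rng):
    """Baseline (assumed): W ~ N(0, sigma^2) with the same E|W| = 2 / n_in."""
    # For a zero-mean normal, E|W| = sigma * sqrt(2 / pi),
    # so sigma = sqrt(2 * pi) / n_in gives E|W| = 2 / n_in.
    sigma = math.sqrt(2.0 * math.pi) / n_in
    return [[rng.gauss(0.0, sigma) for _ in range(n_in)] for _ in range(n_out)]


def mean_abs(weights):
    """Empirical mean of |W| over all entries of a nested-list matrix."""
    total = sum(abs(w) for row in weights for w in row)
    count = sum(len(row) for row in weights)
    return total / count


if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_out = 100, 200
    print("target E|W|  :", 2.0 / n_in)
    print("Laplace init :", mean_abs(laplace_init(n_in, n_out, rng)))
    print("Gaussian init:", mean_abs(gaussian_init(n_in, n_out, rng)))
```

Both empirical means should land close to 2/n_i, so the two inits differ only in the shape of the distribution, not in the expected absolute weight. This only checks the statistics at initialization and says nothing about whether training actually benefits.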