devnkong closed this issue 2 years ago
Also, is there a reason for choosing the eigenvectors with the smallest eigenvalues?
Hi @devnkong, thanks for your questions.
Q: Why is sign flipping not used during eval? A: An eigenvector is only defined up to its sign, so k eigenvectors give 2^k equally valid sign combinations. Random sign flipping during training makes the network invariant to which of these combinations it sees. Once the network has learned that invariance, sign flipping is not required during eval.
Q: Why choose the eigenvectors with small eigenvalues? A: Please refer to Section E.1.2 in https://arxiv.org/pdf/2003.00982.pdf
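To make the train/eval difference concrete, here is a minimal NumPy sketch of the idea, not the repo's actual code. The function names `random_sign_flip` and `get_pos_enc` and the `training` flag are hypothetical, and `pos_enc` is assumed to be an `(num_nodes, k)` array with one eigenvector per column.

```python
import numpy as np


def random_sign_flip(pos_enc, rng):
    """Multiply each eigenvector (column) by an independent random sign.

    pos_enc: array of shape (num_nodes, k), one eigenvector per column.
    Each column keeps or flips its sign with probability 0.5, so one of
    the 2^k possible sign combinations is sampled per call.
    """
    k = pos_enc.shape[1]
    signs = rng.choice([-1.0, 1.0], size=k)
    return pos_enc * signs  # broadcasts the k signs over all nodes


def get_pos_enc(pos_enc, training, rng):
    """Hypothetical helper: flip signs in training, pass through in eval."""
    if training:
        return random_sign_flip(pos_enc, rng)
    return pos_enc  # deterministic at eval time
```

Calling `get_pos_enc(pos_enc, training=True, rng=np.random.default_rng())` once per batch would draw a new sign combination each time, while `training=False` returns the encoding unchanged.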
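For readers who want to see what "eigenvectors with small eigenvalues" means in practice, below is a rough NumPy sketch. It assumes a dense, symmetric adjacency matrix, the symmetric normalized Laplacian, and that the first (trivial) eigenvector is skipped. The name `laplacian_pos_enc` is hypothetical and these choices are assumptions of the sketch; the referenced section is the authoritative description.

```python
import numpy as np


def laplacian_pos_enc(adj, k):
    """Return the k smallest non-trivial eigenpairs of the normalized Laplacian.

    adj: dense symmetric adjacency matrix of shape (n, n).
    Assumption of this sketch: L = I - D^{-1/2} A D^{-1/2}, and the first
    eigenvector (eigenvalue 0 for a connected graph) is dropped.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)

    # D^{-1/2}, leaving isolated nodes (degree 0) at zero.
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5

    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(lap)
    return eigvals[1:k + 1], eigvecs[:, 1:k + 1]
```

The eigenvectors with the smallest eigenvalues vary most slowly over the graph, so they capture its coarse structure. Note that the sign of each returned column is arbitrary.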
Best, Vijay
Thanks!
Hi Vijay,
Thanks for your repo!
Question: I see you're doing sign flipping of the eigenvector pos_enc during training, but it seems you are not doing so at eval time. I understand that we want deterministic predictions, so there is no random flipping during evaluation. Do you have further comments or justification for this?
Best, Kezhi