Closed by sieu-n 2 years ago
https://github.com/dropreg/R-Drop/blob/3d97565595747f3b3d9c4701cb2fb824a9139913/vit_src/models/modeling.py#L298
Isn't L298 supposed to be the following?
loss += self.alpha * (kl_loss + reverse_kl_loss) / 2
Hi @krenerd,
Since the hyperparameter self.alpha already controls the weight of the KL term, a constant factor such as 1/2 can be absorbed into it, so this has little practical impact. But thanks for the reminder; we will revise accordingly.
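To illustrate the point about self.alpha, here is a minimal sketch in plain Python (not the repository's code). The helper names `softmax`, `kl_div`, and `rdrop_loss`, the `halve` flag, the logits, and the placeholder cross-entropy value are all assumptions made up for this example. It only shows that averaging the two KL terms with a given alpha gives the same loss as summing them with half that alpha.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def rdrop_loss(ce_loss, p, q, alpha, halve):
    """Cross-entropy plus a weighted bidirectional KL term (hypothetical helper).

    halve=True  -> alpha * (kl_loss + reverse_kl_loss) / 2
    halve=False -> alpha * (kl_loss + reverse_kl_loss)
    """
    kl_loss = kl_div(p, q)
    reverse_kl_loss = kl_div(q, p)
    kl_term = kl_loss + reverse_kl_loss
    if halve:
        kl_term = kl_term / 2
    return ce_loss + alpha * kl_term

# Two made-up output distributions, standing in for two forward passes
# of the same input with different dropout masks.
p = softmax([2.0, 0.5, -1.0])
q = softmax([1.5, 0.8, -0.7])
ce_loss = 0.9  # placeholder value, not computed from real labels

loss_halved = rdrop_loss(ce_loss, p, q, alpha=1.0, halve=True)
loss_summed = rdrop_loss(ce_loss, p, q, alpha=0.5, halve=False)

# The two formulations coincide once alpha is rescaled by 1/2.
print(math.isclose(loss_halved, loss_summed))
```

The practical caveat is that an alpha value tuned under one formulation needs to be halved or doubled when switching to the other, so reported hyperparameters should state which form was used.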