DavideHe opened this issue 7 months ago (status: Open)
As we all know, xPos already has a decay property, but you also multiply by the decay matrix D after computing Q @ K^T. Is that redundant?
I think your concern is justified. xPos already includes decay, so the decay matrix D may be redundant.
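It seems that xPos is not applied directly in retention. Instead it is split into Θ (the rotation applied to Q and K) and D (the decay), so D would carry the decay rather than duplicate it.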
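To make the split concrete, here is a rough NumPy sketch, not code from the repository. It assumes a single scalar decay `gamma` (xPos itself uses per-dimension scales) and leaves out the rotation Θ, since that part is the same on both sides. The names `decay_mask`, `scores_folded` and `scores_explicit` are made up for this illustration. It checks that folding the scaling into Q and K gives the same scores as applying D after Q @ K^T:

```python
import numpy as np

def decay_mask(seq_len, gamma):
    """Causal decay matrix: D[n, m] = gamma**(n - m) if n >= m, else 0."""
    n = np.arange(seq_len)[:, None]
    m = np.arange(seq_len)[None, :]
    # Clamp the exponent so masked (n < m) entries stay finite before zeroing.
    return np.where(n >= m, gamma ** np.maximum(n - m, 0), 0.0)

rng = np.random.default_rng(0)
seq_len, dim, gamma = 6, 4, 0.9  # illustrative values only
q = rng.standard_normal((seq_len, dim))
k = rng.standard_normal((seq_len, dim))

# xPos-style: scale q by gamma**n and k by gamma**(-m), then apply a causal mask.
pos = np.arange(seq_len)[:, None]
q_scaled = q * gamma ** pos
k_scaled = k * gamma ** (-pos)
causal = np.tril(np.ones((seq_len, seq_len)))
scores_folded = (q_scaled @ k_scaled.T) * causal

# Retention-style: leave q and k unscaled, multiply by D after Q @ K^T.
scores_explicit = (q @ k.T) * decay_mask(seq_len, gamma)

match = np.allclose(scores_folded, scores_explicit)
print(match)
```

Under these assumptions the two score matrices agree, which suggests the decay is only applied once: if Q and K get just the rotation, D is where the decay comes from.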