daidaiershidi closed this issue 1 month ago
Hi! Thanks for your interest in our work! If you set the option eq_t_map for the qk_fusion entry in the YAML config files, the Q and K activations/weights are considered together in the optimal transport problem. The hard alignment solution then returns a single permutation matrix shared by Q and K. Since every permutation matrix is orthogonal, T_qk@T_qk^T=I holds automatically, so the constraint is satisfied.
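To make the argument concrete, here is a minimal toy sketch. It is not the repository's implementation: the sizes, variable names, the squared-distance cost, and the brute-force solver are all assumptions chosen for illustration. It concatenates Q and K features, solves one hard assignment over the joint cost, and checks that the resulting shared matrix is orthogonal.

```python
import itertools
import numpy as np

def hard_alignment(cost):
    """Return the permutation matrix T minimising sum_i cost[i, p[i]].

    Brute force over all permutations, so only suitable for toy sizes.
    (Hypothetical helper, not taken from the repository.)
    """
    n = cost.shape[0]
    best = min(
        itertools.permutations(range(n)),
        key=lambda p: sum(cost[i, p[i]] for i in range(n)),
    )
    T = np.zeros((n, n))
    T[np.arange(n), list(best)] = 1.0
    return T

rng = np.random.default_rng(0)
n, d = 4, 8  # assumed toy sizes: 4 units, 8 feature dimensions

# Model A features, and model B as a row-permuted copy of model A.
q_a, k_a = rng.normal(size=(n, d)), rng.normal(size=(n, d))
true_perm = rng.permutation(n)
q_b, k_b = q_a[true_perm], k_a[true_perm]

# Consider Q and K together: one cost matrix over the concatenated features.
feat_a = np.concatenate([q_a, k_a], axis=1)
feat_b = np.concatenate([q_b, k_b], axis=1)
cost = ((feat_a[:, None, :] - feat_b[None, :, :]) ** 2).sum(axis=-1)

# A single permutation matrix shared by Q and K.
T_qk = hard_alignment(cost)

# Any permutation matrix is orthogonal, so the constraint holds by construction.
is_orthogonal = np.allclose(T_qk @ T_qk.T, np.eye(n))
aligns_q = np.allclose(T_qk @ q_b, q_a)
aligns_k = np.allclose(T_qk @ k_b, k_a)
```

In this sketch the orthogonality check passes for any cost matrix, because the solver only ever returns permutation matrices. A soft (non-permutation) transport map would not come with that guarantee.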
Thank you for sharing such an interesting piece of work. In the paper, I noticed that you want T_qk@T_qk^T=I to hold. How do you ensure this condition? I couldn't find an implementation of it in the code. Is this constraint necessary?