stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models
https://arxiv.org/abs/2404.03592
Apache License 2.0

[P1] Intuition-wise, should we keep the projection orthogonal during training? #77

Closed Edenzzzz closed 2 months ago

Edenzzzz commented 2 months ago

Hi, thanks for the inspiring work! I have a question though: the code initializes the projection to be orthogonal, but I can't seem to find how it is kept orthogonal during training. If it isn't, we can't recover the input even if Wh + b is the identity, can we? Thanks!
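
For context, the LoReFT edit in the paper is phi(h) = h + R^T (Wh + b - Rh). Below is a minimal numerical sketch (variable names are illustrative, not taken from the pyreft code) of what orthogonality of R buys: when R has orthonormal rows (R R^T = I), the subspace coordinates of the edited state are exactly Wh + b, which is the read/write semantics the question is about; with an unconstrained R this generally fails.

```python
# Sketch of the LoReFT edit phi(h) = h + R^T (W h + b - R h), comparing an
# orthonormal R against an unconstrained one. Not pyreft code; just a check.
import torch

d, r = 16, 4                      # hidden size and low-rank subspace size
h = torch.randn(d)
W, b = torch.randn(r, d), torch.randn(r)

# Orthonormal R via QR, so R @ R.T == I_r.
R_orth = torch.linalg.qr(torch.randn(d, r)).Q.T
# Unconstrained R for comparison.
R_free = torch.randn(r, d)

def loreft_edit(R, h):
    return h + R.T @ (W @ h + b - R @ h)

# With orthonormal rows, the edited state's subspace coordinates equal W h + b;
# without the constraint, this identity generally does not hold.
print(torch.allclose(R_orth @ loreft_edit(R_orth, h), W @ h + b, atol=1e-5))  # True
print(torch.allclose(R_free @ loreft_edit(R_free, h), W @ h + b, atol=1e-5))  # False (generically)
```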

frankaging commented 2 months ago

@Edenzzzz thanks for your interest.

to keep it orthogonal during training, I think you just have to call it as done here: https://github.com/stanfordnlp/pyreft/blob/main/pyreft/interventions.py#L25

PyTorch reparameterizes the weight and re-applies the orthogonalization automatically at every step (please refer to the PyTorch tutorial on parametrizations for the different options).
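
A minimal sketch of that mechanism (not the pyreft code itself, just the PyTorch utility the linked line relies on): wrapping a linear layer with the orthogonal parametrization keeps its weight orthonormal after every optimizer step.

```python
# The weight of the wrapped module is recomputed from an unconstrained
# underlying parameter on every access, so it stays orthonormal during training.
import torch
from torch.nn.utils.parametrizations import orthogonal

d, r = 16, 4
proj = orthogonal(torch.nn.Linear(d, r, bias=False))  # weight is r x d; rows kept orthonormal

opt = torch.optim.SGD(proj.parameters(), lr=0.1)
for _ in range(3):
    loss = proj(torch.randn(8, d)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

R = proj.weight
print(torch.allclose(R @ R.T, torch.eye(r), atol=1e-5))  # True: still orthonormal after updates
```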

on intuition:

NoReFT is another ReFT variant that removes the orthogonalization. There are a bunch of other variants in interventions.py right now -- feel free to check them out.
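
Roughly, the only difference between the two variants discussed here is whether the low-rank projection is wrapped in that orthogonal parametrization. A sketch with a hypothetical module (not the actual classes in interventions.py):

```python
# Hypothetical module illustrating LoReFT-style vs NoReFT-style edits:
# same forward pass h + R^T (W h + b - R h), with or without the constraint on R.
import torch
from torch.nn.utils.parametrizations import orthogonal

class ReftEdit(torch.nn.Module):
    def __init__(self, d, r, orthogonalize=True):
        super().__init__()
        rotate = torch.nn.Linear(d, r, bias=False)
        # LoReFT-style: R stays orthonormal during training.
        # NoReFT-style: plain linear projection, no constraint.
        self.rotate = orthogonal(rotate) if orthogonalize else rotate
        self.learned = torch.nn.Linear(d, r)  # W h + b

    def forward(self, h):
        # h + R^T (W h + b - R h)
        return h + (self.learned(h) - self.rotate(h)) @ self.rotate.weight

loreft_like = ReftEdit(d=16, r=4, orthogonalize=True)
noreft_like = ReftEdit(d=16, r=4, orthogonalize=False)
```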

quick summary of what we found in the ablation study: LoReFT still performs the best, but other methods such as NoReFT (which removes the orthogonality constraint) also work pretty well.

frankaging commented 2 months ago

marking this issue as closed for now --- feel free to reopen or open another issue in the future.