stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models
https://arxiv.org/abs/2404.03592
Apache License 2.0

[P1] Questions on differences between paper and code #95

Closed: calpt closed this issue 4 months ago

calpt commented 4 months ago

Hi pyreft team, thanks for the interesting paper and the open-source library!

We're currently working on integrating a couple of ReFT methods into our AdapterHub Adapters library (see here) and came across a few questions while studying your codebase:

1. Weight tying: In section 4.1 of the paper, the following definition for the tied and untied LoReFT variants is given: [screenshot of the tied/untied definitions from the paper]

From my interpretation of this definition, in the untied variant, a separate intervention with an independent set of parameters $\phi$ would be added for each intervened position $p$.

However, looking at the implementation, in the untied variant, there seems to be one shared intervention for all prefixes and one shared intervention for all suffixes. E.g. here: https://github.com/stanfordnlp/pyreft/blob/3a2f19353adbdc1169f05a09ee89c70ea0d65c1b/examples/loreft/train.py#L161-L166

two interventions are created per layer (one shared by all prefix positions and one shared by all suffix positions) if share_weights is set to False.
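To make sure I'm reading this correctly, here is a minimal sketch of my understanding of the untied (share_weights=False) setup. This is not the actual train.py code; the layer list, rank, and hidden size are made up, and I'm glossing over how positions are assigned later in the data pipeline:

```python
import pyreft

layers = [4, 8, 12, 16]   # hypothetical layer selection
rank, hidden = 4, 4096    # hypothetical low-rank dimension / hidden size

# Untied weights: two independent LoReFT interventions per intervened layer,
# one intended for all prefix positions and one for all suffix positions.
# (Which positions each intervention is applied to is decided later, at data
# collation time, not in this config.)
representations = []
for layer in layers:
    for _ in ("prefix", "suffix"):
        representations.append({
            "layer": layer,
            "component": "block_output",
            "low_rank_dimension": rank,
            "intervention": pyreft.LoreftIntervention(
                embed_dim=hidden, low_rank_dimension=rank
            ),
        })

reft_config = pyreft.ReftConfig(representations=representations)
# With tied weights (share_weights=True), I would expect a single entry per layer instead.
```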

Is my understanding of the paper definition wrong?

2. DiReFT: In section 3.2 of the paper, DiReFT is defined as "(...) an ablation of LoReFT which removes the orthogonality constraint and the difference operation, reducing training time".

Looking at the implementation in pyreft, two related interventions are defined: DireftIntervention and NodireftIntervention. It is not fully clear to us which of these corresponds to the DiReFT variant described in the paper.

Thanks in advance for your insights!

frankaging commented 4 months ago

@calpt Thanks for your questions, and thanks for integrating ReFT into your library! I haven't looked at your PR yet, but I would be happy to take a look and provide feedback if that's okay.

Here are inline responses to your questions:

Weight tying: In section 4.1 of the paper, the following definition for the tied and untied LoReFT variants ...

Re: I think your interpretation of tied and untied weights is correct! In our paper, "tied weights" means we only have one intervention per layer. For "untied weights", we have two interventions per layer: one shared among prefix tokens and one shared among suffix tokens. Our two definitions on pg. 6 could be a little confusing, but essentially they are meant to illustrate that we remove the positional dependency while keeping the interventions layer-dependent.
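Spelled out in my notation (not exactly the paper's), with $P_{\text{pre}}$ and $P_{\text{suf}}$ the prefix and suffix positions intervened on at layer $l$:

$$\phi^{\text{tied}}_{l,p} = \phi_l \quad \forall p \in P_{\text{pre}} \cup P_{\text{suf}}, \qquad \phi^{\text{untied}}_{l,p} = \begin{cases} \phi^{\text{pre}}_l & \text{if } p \in P_{\text{pre}} \\ \phi^{\text{suf}}_l & \text{if } p \in P_{\text{suf}} \end{cases}$$

So the parameters depend on the layer $l$ (and, in the untied case, on whether $p$ falls in the prefix or the suffix), but not on the individual position $p$.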

This is minor, but I also want to call out that the intervention config we introduced in the paper is itself pretty limited. One could also potentially try to tie weights among layers, or among a set of locations and layers, etc. It would be great if your library could support that! We have limited bandwidth, but we are also working towards more flexible intervention schemas.

DiReFT: In section 3.2 of the paper, DiReFT is defined as "(...) an ablation of LoReFT which removes the orthogonality constraint ...

Re: Yes! Our naming is outdated. DiReFT in the paper is actually NodireftIntervention. We will change this soon and rename the current DireftIntervention to LodireftIntervention!
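For anyone else landing here, a rough PyTorch sketch of the two update rules as described in the paper. These are toy modules for illustration only, not the actual pyreft intervention classes, and the dimensions are made up:

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal


class LoreftSketch(nn.Module):
    """LoReFT: h + R^T (W h + b - R h), where R has orthonormal rows."""
    def __init__(self, d: int, r: int):
        super().__init__()
        # Enforce orthonormal rows on R via PyTorch's orthogonal parametrization.
        self.R = orthogonal(nn.Linear(d, r, bias=False))
        self.W = nn.Linear(d, r)  # learned projection W h + b

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + (self.W(h) - self.R(h)) @ self.R.weight


class DireftSketch(nn.Module):
    """DiReFT as defined in the paper: h + W2^T (W1 h + b).

    No orthogonality constraint and no difference term; per the comment above,
    this is the variant currently implemented as NodireftIntervention."""
    def __init__(self, d: int, r: int):
        super().__init__()
        self.W1 = nn.Linear(d, r)
        self.W2 = nn.Linear(d, r, bias=False)  # only its weight matrix is used

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.W1(h) @ self.W2.weight


# Quick shape check with made-up sizes.
h = torch.randn(2, 4096)
print(LoreftSketch(4096, 4)(h).shape, DireftSketch(4096, 4)(h).shape)
```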

calpt commented 4 months ago

Thanks for your quick & detailed answers!

I think your interpretation of tied and untied weights is correct! In our paper, "tied weights" means we only have one intervention per layer. For "untied weights", we have two interventions per layer: one shared among prefix tokens and one shared among suffix tokens. Our two definitions on pg. 6 could be a little confusing, but essentially they are meant to illustrate that we remove the positional dependency while keeping the interventions layer-dependent.

Got it, this makes sense. Might be useful to correct the definitions in section 4.1 in a future version of the paper to avoid confusion.

One could also potentially try to tie weights among layers, or among a set of locations and layers, etc. It would be great if your library could support that! We have limited bandwidth, but we are also working towards more flexible intervention schemas.

These sound like sensible additional configurations to experiment with and integrate. For now, we are focusing on the essential configurations covered in the paper to make sure we have a first version ready soon, but we are definitely open to extending beyond this afterwards.