stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models
https://arxiv.org/abs/2404.03592
Apache License 2.0

[P0] Additional intervention arguments are not saved correctly, e.g. `add_bias` #82

Open frankaging opened 1 month ago

frankaging commented 1 month ago

Description:

Users who define custom interventions may want to save them together with their custom arguments, for instance the dropout ratio, the activation function type, or whether to add a bias term to a projection matrix (`add_bias`). Currently, these additional arguments are not saved, so they are lost when an intervention is saved.

The main change will be made in pyvene and is tracked by this parent PR: https://github.com/stanfordnlp/pyvene/issues/157. Once the parent PR is checked in, we should also verify that pyreft works with the updated pyvene library.
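
To illustrate the behavior being asked for, here is a minimal sketch of saving an intervention's extra constructor arguments and restoring them on load. It uses only the Python standard library and does not reflect the actual pyvene or pyreft implementation: `ToyIntervention`, `extra_config`, `save_intervention`, `load_intervention`, and the file name `intervention_config.json` are all hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class ToyIntervention:
    """Hypothetical stand-in for a custom intervention (not a pyvene/pyreft class)."""

    def __init__(self, embed_dim, dropout=0.0, act_fn="linear", add_bias=False):
        self.embed_dim = embed_dim
        self.dropout = dropout
        self.act_fn = act_fn
        self.add_bias = add_bias

    def extra_config(self):
        # Collect every constructor argument needed to rebuild this object,
        # including the custom ones such as add_bias.
        return {
            "embed_dim": self.embed_dim,
            "dropout": self.dropout,
            "act_fn": self.act_fn,
            "add_bias": self.add_bias,
        }


def save_intervention(intervention, save_dir):
    # Write the constructor arguments to a JSON file (hypothetical file name).
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, "intervention_config.json")
    with open(path, "w") as f:
        json.dump(intervention.extra_config(), f)
    return path


def load_intervention(save_dir):
    # Rebuild the intervention from the saved constructor arguments.
    path = os.path.join(save_dir, "intervention_config.json")
    with open(path) as f:
        kwargs = json.load(f)
    return ToyIntervention(**kwargs)


with tempfile.TemporaryDirectory() as tmp:
    original = ToyIntervention(embed_dim=16, dropout=0.1, act_fn="relu", add_bias=True)
    save_intervention(original, tmp)
    restored = load_intervention(tmp)

# The custom arguments should survive the save/load round trip.
print(restored.extra_config() == original.extra_config())
```

A round-trip check like the final comparison could also serve as a regression test when verifying pyreft against the updated pyvene library, assuming the real save and load paths expose the restored arguments in a comparable way.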