Closed mrsempress closed 1 week ago
@mrsempress Sorry about the late reply, but yes, I think those will be the original outputs.
The code that handles this flag is in pyvene:
https://github.com/stanfordnlp/pyvene/blob/main/pyvene/models/intervenable_base.py#L1623
But feel free also to use `model.generate()` directly, where the model is the original Hugging Face model rather than the ReFT model.
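To make the flag handling above concrete, here is a minimal sketch of the call pattern. `generate_with_baseline` is a hypothetical helper (not part of pyvene or pyreft); `generate_fn` stands in for `intervenable.generate`, and the tuple unpacking mirrors the `ori_response, steered_response = intervenable.generate(**generation_args)` line in `examples/loreft/compute_metrics.py`:

```python
def generate_with_baseline(generate_fn, generation_args):
    """Request both the un-intervened and the steered generations.

    Hypothetical helper: setting ``output_original_output`` asks pyvene to
    also run the underlying model without any intervention, so the first
    element of the returned pair is the original model's output.
    """
    args = dict(generation_args)           # avoid mutating the caller's dict
    args["output_original_output"] = True  # also return the un-intervened output
    ori_response, steered_response = generate_fn(**args)
    return ori_response, steered_response
```

Alternatively, as noted above, you can skip the flag entirely and call `model.generate()` on the plain Hugging Face model to get the baseline outputs.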
In the file `examples/loreft/compute_metrics.py`, line 211 reads:

`ori_response, steered_response = intervenable.generate(**generation_args)`

If I add `generation_args["output_original_output"] = True` and use `ori_response` instead of `steered_response`, can I get the model's performance without intervention? I tested with the command, and the accuracy is 5.0, while llama-7b on gsm8k is reported at 11.0. I also tested llama2-7b, which gets 26.3, while llama2-7b on gsm8k is reported at 14.6; llama3-8b gets 23.7, though I did not find a public report for llama3-8b. Is something wrong, or have I misunderstood the meaning of `output_original_output`?