dtch1997 / steering-bench

Evaluation suite for steering vectors

`RuntimeError: generator didn't stop` when evaluating steering vector #1

Open dtch1997 opened 1 month ago

dtch1997 commented 1 month ago

Weird error involving context managers that I don't fully understand

To reproduce:

pdm run pytest tests/integration/test_train_and_evaluate_steering_vector.py

Stack trace:

============================================================================================ FAILURES =============================================================================================
_____________________________________________________________________________ test_train_and_evaluate_steering_vector _____________________________________________________________________________

    def test_train_and_evaluate_steering_vector():

        model_name = "meta-llama/Llama-2-7b-chat-hf"
        train_spec = DatasetSpec("xrisk/coordinate-itself", "0%:+3")
        eval_spec = DatasetSpec("xrisk/coordinate-itself", "50%:+3")
        layer = 13
        steering_token_index = -2

        train_dataset = load_dataset(train_spec)
        eval_dataset = load_dataset(eval_spec)

        model, tokenizer = get_model_and_tokenizer(model_name)
        formatter = ChatFormatter(tokenizer)
        pipeline = Pipeline(model, tokenizer, formatter)

        steering_config = SteeringConfig(
            layer = layer,
            multiplier = 0,
            skip_first_n_generation_tokens = 1,
        )

        steering_vector_training_data = build_steering_vector_training_data(
            pipeline,
            train_dataset,
            steering_token_index=steering_token_index,
        )

        steering_vector = train_steering_vector(
            pipeline.model,
            pipeline.tokenizer,
            steering_vector_training_data,
            layers=[layer],
            show_progress=False,
        )

        sweep = make_sweep_layers_and_multipliers(
            config=steering_config,
            layers=[layer],
            multipliers=[-1.0, 0, 1.0],
        )

>       results = evaluate_steering_vector_sweep(
            sweep,
            pipeline,
            steering_vector,
            eval_dataset,
        )

tests/integration/test_evaluate_steering_vector.py:72: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
steering_bench/steering/evaluate_steering_vector.py:56: in evaluate_steering_vector_sweep
    result = evaluate_steering_vector(
steering_bench/steering/evaluate_steering_vector.py:34: in evaluate_steering_vector
    result = evaluate(
steering_bench/evaluate.py:174: in evaluate
    positive_probs = pipeline.logprobs(example.positive_completion)
steering_bench/core/pipeline.py:133: in logprobs
    with ExitStack() as stack:
/usr/lib/python3.11/contextlib.py:601: in __exit__
    raise exc_details[1]
/usr/lib/python3.11/contextlib.py:586: in __exit__
    if cb(*exc_details):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <contextlib._GeneratorContextManager object at 0x7841803231d0>, typ = None, value = None, traceback = None

    def __exit__(self, typ, value, traceback):
        if typ is None:
            try:
                next(self.gen)
            except StopIteration:
                return False
            else:
                try:
>                   raise RuntimeError("generator didn't stop")
E                   RuntimeError: generator didn't stop

/usr/lib/python3.11/contextlib.py:149: RuntimeError
-------------------------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------------------------
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Loading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00,  3.08s/it]
We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
Evaluating: 100%|██████████| 3/3 [00:01<00:00,  2.10it/s]
Evaluating:   0%|          | 0/3 [00:00<?, ?it/s]
===================================================================================== short test summary info =====================================================================================
FAILED tests/integration/test_evaluate_steering_vector.py::test_train_and_evaluate_steering_vector - RuntimeError: generator didn't stop
chanind commented 1 month ago

The issue is that there are two yield statements in SteeringHook::__call__(), and a @contextmanager generator must yield exactly once. It looks like the intent is that a multiplier of 0 should just yield without patching, but the code then falls through and hits the second yield, so when __exit__ resumes the generator it doesn't stop. This can be solved by adding an else: after the first yield, to ensure that exactly one yield runs. E.g.:

            # A @contextmanager body must yield exactly once; multiplier 0 is
            # equivalent to no steering, so just yield without patching in that case
            if self.config.multiplier == 0:
                yield
            else:
                # Otherwise patch in the steering vector and yield once while
                # the patch is active
                handle = layer_sv.patch_activations(
                    model=context.pipeline.model,
                    layer_config=self.config.layer_config,
                    multiplier=self.config.multiplier,
                    min_token_index=min_token_index,
                    operator=get_patch_operator(self.config.patch_operator),
                )
                yield
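
For anyone hitting this later, the failure mode reproduces in isolation. Here is a minimal sketch, independent of steering-bench (buggy/fixed and the handle variable are stand-ins, not the real SteeringHook code): a @contextmanager generator with a reachable second yield raises exactly this RuntimeError, and the else branch prevents it.

from contextlib import contextmanager

@contextmanager
def buggy(multiplier):
    # BUG: when multiplier == 0 we yield, then fall through to a second yield
    if multiplier == 0:
        yield
    handle = "patched"  # stand-in for patch_activations(...)
    yield

@contextmanager
def fixed(multiplier):
    if multiplier == 0:
        yield
    else:
        handle = "patched"  # stand-in for patch_activations(...)
        yield

with fixed(0):
    pass  # fine: exactly one yield ran

with buggy(0):
    pass  # RuntimeError: generator didn't stop -- __exit__ resumes the
          # generator expecting StopIteration, but hits the second yield

This also presumably explains the captured output above: the first "Evaluating" run (multiplier -1.0) completes, and the second run fails at 0% because only the multiplier == 0 case reaches the second yield.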