andyzoujm / representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency
https://www.ai-transparency.org/

Contrast Vector - Add Code for Generation? #28

Open RobertMcCarthy97 opened 1 year ago

RobertMcCarthy97 commented 1 year ago

Hi,

I was wondering if code demonstrating how to use the contrast vector for text generation could be pushed? (Ideally for a 7B model!)

I am quite unclear on how to convert the contrast vector code from honesty_control_TQA.ipynb to allow for generation. And I am finding it difficult to parse all the contrast vector implementation details from the paper.

This would be very useful, so I would really appreciate it!

Thanks, Robert

justinphan3110cais commented 1 year ago

Yes, we are planning to fully support this in the rep-control pipeline soon. Will keep this thread updated.

RobertMcCarthy97 commented 12 months ago

Great to hear, thanks!

wj210 commented 12 months ago

Hi, can I ask for clarification on rep_control_pipeline.py: how is `self.wrapped_model` used?

In the `__init__` fn:

```python
self.wrapped_model = WrappedReadingVecModel(model, tokenizer)
self.wrapped_model.unwrap()
self.wrapped_model.wrap_block(layers, block_name=block_name)
self.block_name = block_name
self.layers = layers
super().__init__(model=model, tokenizer=tokenizer, **kwargs)
```

`model` is still set to the original model, and the pipeline runs `outputs = super().__call__(text_inputs, **kwargs)`. Doesn't that call the original model? How are the activations integrated into the outputs in this case?

Is the output text in the Jupyter notebook example produced by a normal model call or by the generate fn?

Thanks!

justinphan3110cais commented 12 months ago

@wj210, you can use either `wrapped_model.generate` or `model.generate` once the activations are set by `wrapped_model`. An example is shown in the last cell of this notebook.
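This works because `wrap_block` replaces the model's decoder layers with wrapper modules in place, so any call path (the pipeline's `__call__`, `model.generate`, or `wrapped_model.generate`) runs through the wrapped blocks. Roughly, the mechanism looks like this (a simplified sketch; `BlockWrapper` and `controller` are illustrative names, not the repo's exact classes):

```python
import torch.nn as nn

class BlockWrapper(nn.Module):
    """Adds a stored control vector to a decoder block's output."""
    def __init__(self, block):
        super().__init__()
        self.block = block
        self.controller = None  # set to a reading/contrast vector before generation

    def forward(self, *args, **kwargs):
        output = self.block(*args, **kwargs)
        if self.controller is not None:
            # decoder blocks return a tuple; element 0 is the hidden states
            hidden = output[0] + self.controller.to(output[0].device)
            output = (hidden,) + output[1:]
        return output

# Wrapping mutates the model in place, e.g.:
# for i in layer_ids:
#     model.model.layers[i] = BlockWrapper(model.model.layers[i])
```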

wj210 commented 12 months ago

It shows `outputs = wrapped_model(**encoded_inputs.to(model.device), output_hidden_states=True)['hidden_states']`, but those outputs are overridden by the later `model.generate` call.

Also, if I use `wrapped_model.generate`, it doesn't take other generation args such as `do_sample`, `temperature`, etc.? The signature is `def generate(self, prompt, max_new_tokens=100, random_seed=0, use_cache=True):`

justinphan3110cais commented 12 months ago

The `outputs = wrapped_model(**encoded_inputs.to(model.device), output_hidden_states=True)['hidden_states']` line should be deleted. My bad, and thanks for the catch.

I think you may have an older version of the code for `wrapped_model.generate`.
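If your local copy still has the narrow signature, forwarding the extra generation arguments is a small change. A sketch of the idea, assuming the signature quoted above (illustrative body, not the repo's exact code):

```python
import torch

def generate(self, prompt, max_new_tokens=100, random_seed=0, use_cache=True, **kwargs):
    # Pass any extra HF generation args (do_sample, temperature, ...) straight through.
    torch.manual_seed(random_seed)
    inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
    output_ids = self.model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        use_cache=use_cache,
        **kwargs,
    )
    return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
```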

merlinarer commented 11 months ago

Hello, any progress on this thread? Or could you give us some instructions so that we can help implement it?

justinphan3110cais commented 11 months ago

Hi @merlinarer @RobertMcCarthy97, sorry for the late update. We have just updated the TQA_honesty notebook to support Llama and Mistral models (`CasadingLlamaForCausalLM`, `CasadingMistralForCausalLM`).

merlinarer commented 11 months ago

> Hi @merlinarer @RobertMcCarthy97, sorry for the late update. We have just updated the TQA_honesty notebook to support Llama and Mistral models (`CasadingLlamaForCausalLM`, `CasadingMistralForCausalLM`).

Awesome!! Is there any guide for adding support for other models?

justinphan3110cais commented 11 months ago

We have just updated the code a bit to include generation: honesty_contrast_vec_TQA_generation.

@merlinarer, let us know what other architectures you would like supported. `ContrastVecMistralForCausalLM` (renamed from `CasadingMistralForCausalLM`) should work with most Mistral-based models like Zephyr, OpenChat, etc.
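For reference, the contrast-vector idea at generation time is: encode the input under a positive and a negative instruction, take the hidden-state difference at selected layers, and add a scaled version of that difference to the residual stream while decoding. A compressed sketch of that computation (plain HF/PyTorch, not the repo's exact interface; it assumes `model`, `tokenizer`, contrasting `pos_prompt`/`neg_prompt`, and hyperparameters `alpha`, `layer_ids` are already defined):

```python
import torch

# Encode the same question under contrasting instructions (e.g. honest vs. untruthful).
pos_ids = tokenizer(pos_prompt, return_tensors="pt").input_ids.to(model.device)
neg_ids = tokenizer(neg_prompt, return_tensors="pt").input_ids.to(model.device)

with torch.no_grad():
    h_pos = model(pos_ids, output_hidden_states=True).hidden_states
    h_neg = model(neg_ids, output_hidden_states=True).hidden_states

# Per-layer contrast vectors from the last-token hidden states.
contrast = {l: alpha * (h_pos[l][:, -1] - h_neg[l][:, -1]) for l in layer_ids}
# These are then added to the corresponding layers' activations during generation
# (in the repo this is handled inside the ContrastVec* classes).
```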

YerongLi commented 4 months ago

The notebook only shows how to use `ContrastVecMistralForCausalLM` without fine-tuning. I am wondering: if we don't go through the tuning process in Algorithm 1, will performance be as good as the paper shows on the TQA dataset?