PASTA: Post-hoc Attention Steering for LLMs

I cannot reproduce the demo experiment #6

Closed by Sniper970119 9 months ago

Sniper970119 commented 11 months ago

I tried with this code:

from pastalib.pasta import PASTA
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Initialize pre-trained LLM
name = "/alg_vepfs/public/models/llama2-7b-hf"
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True, device_map="auto",
                                             torch_dtype=torch.float32)

tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)

# Attention heads to steer, given as {layer index: [head indices]}
head_config = {"3": [17, 6, 12], "5": [24], "0": [17], "7": [13], "11": [16], "8": [28, 24], "4": [3]}

# Initialize the PASTA steerer
pasta = PASTA(
    model=model,
    tokenizer=tokenizer,
    head_config=head_config,
    alpha=0.01,  # scaling coefficient
    scale_position="exclude",  # downweighting unselected tokens
)

# Model Input
texts = [
    "Mary is a doctor. She obtains her bachelor degree from UCSD. Answer the occupation of Mary and generate the answer as json format."]

# ===== Without PASTA =====
inputs = tokenizer(texts, return_tensors="pt")
inputs.to('cuda:0')
outputs = model.generate(**inputs, max_new_tokens=100)
# print(outputs)
print(tokenizer.decode(outputs[0]))
print('*' * 20)
# ---------------------
# ["The answer should be in json format."]  # returns answer in the wrong format

# ===== With PASTA =====
inputs, offset_mapping = pasta.inputs_from_batch(texts)
# User highlights specific input spans
emphasized_texts = ["Answer the occupation of Mary and generate the answer as json format"]
# PASTA registers the pre_forward_hook to edit attention
with pasta.apply_steering(
        model=model,
        strings=texts,
        substrings=emphasized_texts,
        model_input=inputs,
        offsets_mapping=offset_mapping
) as steered_model:
    outputs = steered_model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(outputs[0]))

It's the same: neither output is in JSON format.

(screenshot of the model output)
QingruZhang commented 11 months ago

Hi, the demo code is for huggyllama/llama-7b. The attention patterns of LLaMA-2 differ from LLaMA-7B, so its head_config needs to be profiled specifically to be effective. Right now, config/head_config includes the steered-head configurations for GPT-J and LLaMA-7B. The example code for LLaMA-7B has also been reproduced (please see this). We will release the attention heads for LLaMA-2 in the future.
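
For reference, a minimal sketch of loading the released LLaMA-7B head_config from config/head_config instead of hard-coding it (the filename below is illustrative; please check the directory for the actual file):

import json
from pastalib.pasta import PASTA

# Illustrative filename -- check config/head_config in the repo for the actual file.
with open("config/head_config/llama_7b.json") as f:
    head_config = json.load(f)

pasta = PASTA(
    model=model,            # should be huggyllama/llama-7b, matching the profiled config
    tokenizer=tokenizer,
    head_config=head_config,
    alpha=0.01,
    scale_position="exclude",
)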

Sniper970119 commented 11 months ago

Same result with llama-7b:

name = "/alg_vepfs/public/models/llama-7b"
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True, device_map="auto",
                                             torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)

and I get this result:

(screenshot of the model output)

Maybe a seed is needed to reproduce the demo experiment?

QingruZhang commented 11 months ago

Greedy search is used in the demo code, so there should be no randomness. Can you confirm that the local model files are exactly the same as the open-sourced huggyllama/llama-7b, i.e., the same forward pass and weights, not fine-tuned? Since we conducted the profiling on huggyllama/llama-7b, the demo head_config should be applied to that exact model. For example, the quickstart code has been reproduced by others with huggyllama/llama-7b downloaded from Hugging Face: https://github.com/QingruZhang/PASTA/issues/4.
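
As a quick sanity check (not part of the demo), you can also pass do_sample=False explicitly and verify that two runs produce identical outputs:

# Greedy decoding is deterministic, so repeated runs on the same input must match exactly.
out1 = model.generate(**inputs, max_new_tokens=100, do_sample=False)
out2 = model.generate(**inputs, max_new_tokens=100, do_sample=False)
assert torch.equal(out1, out2), "Outputs differ, decoding is not deterministic"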