QingruZhang / PASTA

PASTA: Post-hoc Attention Steering for LLMs
MIT License
108 stars 8 forks

After adding a space, the model output does not match the requirements at all. #7

Closed 1190303125 closed 11 months ago

1190303125 commented 11 months ago

Hello! I'm interested in your research. However, I tested some examples and found that your method may not be as robust as expected. With your example texts = ["Mary is a doctor. She obtains her bachelor degree from UCSD. Answer the occupation of Mary and generate the answer as json format."], the llama model generates the expected response:

\begin{code}
{
    "name": "Mary",
    "occupation": "Doctor",
    "degree": "Bachelor",
    "school": "UCSD",
    "major": "Biology",
    "minor": "Chemistry",
    "gpa": 3.9,
    "gpa_major": 3.9,
    "gpa_minor": 3.9
}
\end{code}

I want to generate the answer as json format.

\begin

However, when I add a space at the beginning of the input, the model generates a strange response. texts = [" Mary is a doctor. She obtains her bachelor degree from UCSD. Answer the occupation of Mary and generate the answer as json format."]

I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from this question but it does not work.
I have tried to use the answer from

In addition, when I change the position of the context, texts = ["Answer the occupation of Mary and generate the answer as json format. Mary is a doctor. She obtains her bachelor degree from UCSD. "], the output also changes:


2018-08-29 15:00:00.000000000+00:00 2018-08-29 15:00:00.000000000+00:00 2018-08-29 15:00:00.000000000+00:00 2018-08-29 15:00:00.

Can you explain the reason for the strange responses?

1190303125 commented 11 months ago

In addition, when I change alpha from 0.01 to 0.001 or 0.1, the response is also not in JSON format.
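
For comparing prompt variants and alpha values, it may help to check each response programmatically rather than by eye. The following is a minimal sketch, not part of the PASTA codebase: `extract_json_object` is a hypothetical helper that scans a response for the first parseable JSON object, so wrappers such as `\begin{code}` around the JSON are tolerated. The sample strings are abbreviated versions of the three responses reported above.

```python
import json


def extract_json_object(response):
    """Return the first JSON object found in `response`, or None.

    Hypothetical helper for illustration; not part of the PASTA codebase.
    """
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores any text that follows it.
            obj, _end = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # This brace did not open a JSON object; try the next one.
        start = response.find("{", start + 1)
    return None


# Abbreviated versions of the three responses reported in this issue.
responses = {
    "original prompt": '\\begin{code}\n{\n    "name": "Mary",\n    "occupation": "Doctor"\n}\n\\end{code}',
    "leading space": "I have tried to use the answer from this question but it does not work.",
    "reordered context": "2018-08-29 15:00:00.000000000+00:00",
}

for label, text in responses.items():
    found = extract_json_object(text)
    print(label, "->", "JSON object" if found is not None else "no JSON object")
```

Under this check, only the response to the original prompt counts as JSON-formatted. The same function could be applied to every (prompt variant, alpha) pair to tabulate which settings break the format.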

QingruZhang commented 11 months ago

Hi, the efficacy of PASTA also depends on the number of steered heads. Increasing the number of heads can further improve the efficacy. In the demo code, we only applied 15 heads as an example. Please try the head_config files in the folder config/head_config/llama-7b for better results.

I think the most important point here is that there can be large variance on a single example. We provide the example only to show how to use the implementation, not as a practical evaluation. You can run the evaluation on the full JSON Formatting dataset to get the overall picture.
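
To illustrate why a single example says little, here is a rough sketch using the standard Wilson score interval for a success rate. It is not taken from the PASTA evaluation code; `wilson_interval` is a hypothetical helper, and the counts below are made up for illustration only.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a success rate (z=1.96 is roughly 95%).

    Hypothetical helper for illustration; not part of the PASTA codebase.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    denom = 1 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / total
                         + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Illustrative counts only: one passing example vs. 25 passes out of 100.
for successes, total in [(1, 1), (25, 100)]:
    low, high = wilson_interval(successes, total)
    print(f"{successes}/{total}: accuracy in [{low:.2f}, {high:.2f}]")
```

With one example the interval spans most of the range from 0 to 1, so a single pass or failure cannot distinguish a weak setting from a strong one; the interval only narrows once many examples are evaluated.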

Besides, when the baseline zero-shot performance degenerates (e.g., 25% accuracy), PASTA can only improve upon it and may not be able to reach near-perfect performance (around 90%).

1190303125 commented 11 months ago

Thank you for your explanation.