ericwtodd / function_vectors

Function Vectors in Large Language Models (ICLR 2024)
https://functions.baulab.info/

Creating top_heads list in compute_universal_function_vector (extract_utils.py) for new models #13

Closed: harisethuram closed this issue 1 month ago

harisethuram commented 1 month ago

I would like to construct the top_heads list in compute_universal_function_vector for new models (such as Llama 3). In issues #1 and #10, you mention the script that computes the activations for one task, and the prompt settings. However, I'm not sure which datasets specifically you aggregate over to compute the average activations. Could you clarify this? Thanks!

ericwtodd commented 1 month ago

Yes, of course! To create the top_heads list, we aggregate the indirect effects from all the abstractive datasets on which the model does better than baseline performance (see Appendix G in the paper, and E.2 for an example of baseline ICL performance; we use the majority label as the baseline). For GPT-J, for example, 18 tasks were used to compute the AIE of each head, based on GPT-J's ICL performance:

gptj_tasks = ['antonym', 'capitalize', 'capitalize_first_letter', 'country-capital', 
              'country-currency', 'english-french', 'english-german', 'english-spanish',
              'landmark-country', 'lowercase_first_letter',  'national_parks', 'park-country',
              'person-sport', 'present-past', 'product-company', 'sentiment', 'singular-plural', 'synonym']
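For concreteness, here is a minimal sketch of that aggregation step (this is not the code in extract_utils.py; `select_top_heads`, `load_task_aie`, and `n_top` are hypothetical names). It assumes each task's mean indirect effects have already been saved as an (n_layers, n_heads) array, averages them over the qualifying tasks, and takes the largest entries as the top heads:

```python
import numpy as np

def select_top_heads(task_names, load_task_aie, n_top=10):
    """Average per-task mean indirect effects and return the n_top (layer, head) pairs.

    task_names:    tasks where the model beats the majority-label baseline
    load_task_aie: callable mapping a task name to an (n_layers, n_heads) array of AIEs
    """
    # Stack into (n_tasks, n_layers, n_heads) and average over the task dimension.
    aie_stack = np.stack([load_task_aie(t) for t in task_names])
    mean_aie = aie_stack.mean(axis=0)

    # Take the indices of the largest averaged effects and map them back to (layer, head).
    flat_order = np.argsort(mean_aie, axis=None)[::-1][:n_top]
    return [tuple(int(v) for v in np.unravel_index(i, mean_aie.shape)) for i in flat_order]
```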

Let me know if you have more questions, thanks!