TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License

add n_kv_heads to model properties table #610

Closed anthonyduong9 closed 1 month ago

anthonyduong9 commented 1 month ago

Description

Adds "n_kv_heads" to the Model Properties Table in the documentation. Some models use multi-query attention (a single key and value head shared by all query heads) or grouped-query attention (multiple key and value heads, but fewer than the number of query heads). With this column, users can see at a glance which models use either scheme.
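To illustrate how the new column can be read, here is a minimal sketch that classifies an attention layout from its two head counts. The function name `attention_variant` and its parameters are hypothetical and not part of TransformerLens. It assumes that a missing `n_kv_heads` value means standard multi-head attention, and that the query head count is a multiple of the key/value head count.

```python
from typing import Optional


def attention_variant(n_heads: int, n_kv_heads: Optional[int]) -> str:
    """Classify an attention layout from its query and key/value head counts.

    Hypothetical helper for illustration only; not a TransformerLens API.
    """
    # Assumption: no value means every query head has its own key/value head.
    if n_kv_heads is None or n_kv_heads == n_heads:
        return "multi-head"
    # Assumption: query heads are split evenly across key/value heads.
    if n_kv_heads < 1 or n_kv_heads > n_heads or n_heads % n_kv_heads != 0:
        raise ValueError(
            f"n_kv_heads={n_kv_heads} is not a valid divisor of n_heads={n_heads}"
        )
    # One shared key/value head is multi-query attention.
    if n_kv_heads == 1:
        return "multi-query"
    # More than one, but fewer than n_heads, is grouped-query attention.
    return "grouped-query"
```

For example, `attention_variant(32, 8)` returns `"grouped-query"`, while `attention_variant(32, 1)` returns `"multi-query"`.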

Fixes https://github.com/TransformerLensOrg/TransformerLens/issues/522

Type of change


Screenshots


Before

Screenshot 2024-05-27 at 5 10 05 PM Screenshot 2024-05-27 at 5 10 18 PM

After

Screenshot 2024-05-27 at 5 08 53 PM Screenshot 2024-05-27 at 5 09 09 PM

Checklist:

bryce13950 commented 1 month ago

@anthonyduong9 Thanks for adding these keys. Could you fill in the template on your original comment? I know this is a draft, but it would be good to have an idea of what remains to be done, so we can understand what you are all planning.

anthonyduong9 commented 1 month ago

> @anthonyduong9 Thanks for adding these keys. Could you fill in the template on your original comment? I know this a draft, but it would be good to have an idea of what is remaining on this to understand what you are all planning.

@bryce13950 No problem. I've filled out the template. Let me know if anything's unclear.

bryce13950 commented 1 month ago

Beautiful