TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License

Release 2.2.1 #668

Closed by bryce13950 2 months ago

bryce13950 commented 2 months ago

The current result projection for attention is incorrect. The type annotations suggest that result is not summed over head_index, but in fact it is. I've edited the function so that result is no longer summed over head_index and keeps one entry per head.

Note: this bug caused the ARENA material for the first transformers chapter to fail. I've tested the fix, and the material now works.
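
To illustrate the head_index distinction described above, here is a minimal NumPy sketch. It is not the library's actual code: the tensor names (z, W_O), the axis order, and the sizes are assumptions chosen for illustration. It shows the difference between keeping head_index in the projected result and contracting it away.

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration.
batch, pos, n_heads, d_head, d_model = 2, 3, 4, 5, 6

rng = np.random.default_rng(0)
# Assumed per-head attention-weighted values: [batch, pos, head_index, d_head]
z = rng.standard_normal((batch, pos, n_heads, d_head))
# Assumed per-head output weights: [head_index, d_head, d_model]
W_O = rng.standard_normal((n_heads, d_head, d_model))

# Per-head result: head_index ("h") stays in the output, so each head's
# contribution can be inspected separately.
result_per_head = np.einsum("bphd,hdm->bphm", z, W_O)

# Summed output: "h" is absent from the output spec, so einsum contracts
# over head_index. This is the shape a per-head annotation would NOT expect.
attn_out = np.einsum("bphd,hdm->bpm", z, W_O)

print(result_per_head.shape)  # (2, 3, 4, 6)
print(attn_out.shape)         # (2, 3, 6)
```

Summing the per-head result over its head axis recovers the summed output, so the per-head form loses no information. A shape check like the one above is a cheap way to catch this kind of mismatch between an annotation and the actual computation.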

