ApolloResearch / rib

Library for methods related to the Local Interaction Basis (LIB)
MIT License

Separated out attention scores into module to be cache-able #246

Closed: stefan-apollo closed this 9 months ago

stefan-apollo commented 9 months ago

AttentionScores module

Description

Add a class AttentionScores that carries out the computation from q and k to attn_scores. Moving this step into its own module lets us hook it and cache its output.
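For readers skimming the PR, here is a minimal sketch of the idea rather than the code in this diff. It assumes attn_scores means the pre-softmax scaled dot product of q and k, and it leaves out causal masking and any other model-specific details. The hook function `save_scores` and the `cache` dict are hypothetical names used only for illustration; only `nn.Module.register_forward_hook` is standard PyTorch.

```python
import math

import torch
from torch import nn


class AttentionScores(nn.Module):
    """Illustrative stand-in: maps (q, k) to pre-softmax attention scores."""

    def forward(self, q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
        # q, k: (batch, n_heads, seq, d_head) -> scores: (batch, n_heads, seq, seq)
        d_head = q.shape[-1]
        return q @ k.transpose(-2, -1) / math.sqrt(d_head)


cache = {}  # hypothetical cache, not the library's hooked-data structure


def save_scores(module, inputs, output):
    # Forward hooks receive the module, its positional inputs and its output.
    cache["attention_scores"] = output.detach()


attn_scores = AttentionScores()
handle = attn_scores.register_forward_hook(save_scores)

q = torch.randn(2, 4, 5, 8)
k = torch.randn(2, 4, 5, 8)
pattern = torch.softmax(attn_scores(q, k), dim=-1)  # hook fires during this call
handle.remove()
```

Because the scores are now the output of a module, a standard forward hook can capture them without touching the rest of the attention code.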

Motivation and Context

This was required for interpretability work: we need to be able to hook and cache the attention scores.

How Has This Been Tested?

Does this PR introduce a breaking change?

No

Confusions

I am a bit confused about why my cached activations were a list (of length 1), which meant I had to index them like this:

cache_long["sections.section_0.0.attention_scores"]["acts"][0]

For now I am filing this under "that's just how our hooks work".
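One possible explanation, offered as a guess and not as a description of our actual hook code: if a hook normalises every module output to a list so that single-output and multi-output modules share the same cache layout, a module that returns one tensor ends up with a list of length 1. The sketch below shows that pattern; `hooked_data` and `make_acts_hook` are hypothetical names.

```python
import torch
from torch import nn

hooked_data = {}  # hypothetical cache keyed by hook name


def make_acts_hook(hook_name):
    def hook(module, inputs, output):
        # Assumption: outputs are always stored as a list, so a module that
        # returns a tuple of tensors and one that returns a single tensor
        # can be read back the same way.
        outputs = list(output) if isinstance(output, tuple) else [output]
        hooked_data[hook_name] = {"acts": [o.detach() for o in outputs]}

    return hook


layer = nn.Linear(3, 3)
handle = layer.register_forward_hook(make_acts_hook("layer"))
layer(torch.randn(2, 3))
handle.remove()

acts = hooked_data["layer"]["acts"]  # a list holding exactly one tensor
single = acts[0]
```

Under that assumption the trailing `[0]` is expected for any module with a single output tensor.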

stefan-apollo commented 9 months ago

Adding this module revealed issue #245, which I am tracking separately.