ApolloResearch / rib

Library for methods related to the Local Interaction Basis (LIB)
MIT License

Split layernorm into two sequential modules #305

Closed danbraunai-apollo closed 6 months ago

danbraunai-apollo commented 6 months ago

Split layernorm into two sequential modules

Description

Related Issue

Closes #299

How Has This Been Tested?

The same tests pass. Notably, the sequential model with the new layer norm layers gets the same per-module outputs as TransformerLens for various models. Also, folding the bias does not affect the output.
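For readers unfamiliar with bias folding, here is a minimal sketch of why it should leave outputs unchanged. It is not the code in this repo: the function names (`linear`, `fold_bias`, `linear_folded`) are hypothetical, and it assumes folding means appending the bias as an extra weight column that multiplies a constant 1 appended to the input.

```python
import math


def linear(x, weight, bias):
    """Plain affine map y = W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def fold_bias(weight, bias):
    """Hypothetical helper: append each bias entry as an extra column of W."""
    return [row + [b] for row, b in zip(weight, bias)]


def linear_folded(x, folded_weight):
    """Apply the folded weights to the input with a constant 1 appended."""
    x_aug = x + [1.0]
    return [sum(w * xi for w, xi in zip(row, x_aug)) for row in folded_weight]


x = [0.5, -1.0, 2.0]
weight = [[1.0, 2.0, 3.0], [0.0, -1.0, 0.5]]
bias = [0.1, -0.2]

plain = linear(x, weight, bias)
folded = linear_folded(x, fold_bias(weight, bias))

# The two forms should agree up to floating point error.
outputs_match = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(plain, folded))
```

The actual check in the tests compares full model outputs with and without folding; this only illustrates the identity for a single linear map.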

Does this PR introduce a breaking change?

No, since we kept the ln1, ln2, and ln_final names around, even though they don't really describe what those layers now do (calculate the variance).
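As a rough illustration of the split (not the implementation in this PR), the sketch below breaks a layer norm without learned scale or bias into two sequential stages and checks that the composition matches the unsplit version. The names `ln_in` and `ln_out_stage` are hypothetical. It assumes the first stage centers the input and passes the variance along as an extra element, and that the second stage does the division; the real modules may divide the work differently.

```python
import math

EPS = 1e-5


def layer_norm(x, eps=EPS):
    """Reference: unsplit layer norm over one vector (no learned scale or bias)."""
    mean = sum(x) / len(x)
    centered = [xi - mean for xi in x]
    var = sum(c * c for c in centered) / len(x)
    return [c / math.sqrt(var + eps) for c in centered]


def ln_in(x):
    """Stage 1 (assumed): center the input and append its variance."""
    mean = sum(x) / len(x)
    centered = [xi - mean for xi in x]
    var = sum(c * c for c in centered) / len(x)
    return centered + [var]


def ln_out_stage(x_and_var, eps=EPS):
    """Stage 2 (assumed): pop the variance and divide by sqrt(var + eps)."""
    *centered, var = x_and_var
    scale = math.sqrt(var + eps)
    return [c / scale for c in centered]


x = [1.0, 2.0, 4.0, -3.0]
whole = layer_norm(x)
split = ln_out_stage(ln_in(x))

# Composing the two stages should reproduce the unsplit layer norm.
split_matches = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(whole, split))
```

In this sketch the activation between the two stages carries one extra element (the variance), which is what would make that point usable as a separate node layer.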

danbraunai-apollo commented 6 months ago

I should note that I haven't actually built any graphs with this change. Before merging, we should look at the graphs when the node layers include ["ln1", "ln1_out", "ln2", "ln2_out"] to see if this change does what we're hoping it does.

We should probably also include an ln_out node layer in one of our tests to show that we can actually build a graph at this layer.

nix-apollo commented 6 months ago

Comparison of rib build graphs on the last layer of pythia. This uses the (1-0) basis and is non-centered, so it's not the most reliable graph.

Without splitting layer norm: [image: tinystories-whole-ln_rib_graph]

With split layer norm: [image: tinystories-split-ln_rib_graph]

I think we mostly see what we expect here, although:

But again, this is a somewhat mistaken basis.

nix-apollo commented 6 months ago

See this slack thread for some further experiments: https://apolloresearchhq.slack.com/archives/C06484S5UF9/p1706176556896229