ApolloResearch / rib

Library for methods related to the Local Interaction Basis (LIB)
MIT License

Add Stochastic Noise Sources (SNS) attribution method #254

Closed danbraunai-apollo closed 8 months ago

danbraunai-apollo commented 9 months ago

Add Stochastic Noise Sources (SNS) for Edges

Description

Motivation and Context

The current edge attribution method needs to iterate over output position dimensions. With stochastic sources, we only need to iterate over the stochastic source dimension, which should be << output_pos_dim.
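To illustrate why a small number of stochastic sources can stand in for the full loop over output positions, here is a minimal NumPy sketch of a random-sign (Hutchinson-style) estimator. It is not the code in this PR: the function names `exact_sq_sum` and `stochastic_sq_sum` are hypothetical, the Jacobian is materialised as a plain array purely for demonstration, and it is an assumption that the SNS method uses this kind of ±1 projection. The idea is that for random signs r with E[r rᵀ] = I, the squared projection (Σ_t r_t J[t, i])² has expectation Σ_t J[t, i]², so averaging over a few sources approximates the sum over all output positions.

```python
import numpy as np


def exact_sq_sum(jac: np.ndarray) -> np.ndarray:
    """Sum of squared Jacobian entries over the output-position axis (axis 0).

    Computing this without materialising `jac` would need one pass per
    output position.
    """
    return (jac ** 2).sum(axis=0)


def stochastic_sq_sum(
    jac: np.ndarray, n_sources: int, rng: np.random.Generator
) -> np.ndarray:
    """Estimate the same quantity from `n_sources` random +/-1 projections.

    Each row of `signs` plays the role of one stochastic source. In a real
    model each projection would be a single backward pass, so the cost
    scales with n_sources rather than with the number of output positions.
    """
    out_pos = jac.shape[0]
    signs = rng.choice([-1.0, 1.0], size=(n_sources, out_pos))
    projected = signs @ jac  # shape: (n_sources, in_dim)
    return (projected ** 2).mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    jac = rng.normal(size=(64, 8))  # toy (output_pos_dim, in_dim) Jacobian
    print("exact:    ", exact_sq_sum(jac))
    print("estimated:", stochastic_sq_sum(jac, n_sources=16, rng=rng))
```

The estimate is unbiased but noisy: its error shrinks roughly as 1/sqrt(n_sources), which is why convergence to the true edges has to be checked empirically for each setting.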

How Has This Been Tested?

There is more extensive comparison/testing code in https://github.com/ApolloResearch/interp/blob/main/interp/dan/rib/stochastic_edge_comparison.py (and results thread here). This code shows that the stochastic sources do seem to converge to the true edges for a variety of settings (in pythia).

Right now, we're not doing very thorough testing or measuring scaling laws for this.

Does this PR introduce a breaking change?

No