callummcdougall / sae_vis

Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
MIT License
140 stars, 27 forks

supporting mlp and attn out hooks #19

Closed: chanind closed 6 months ago

chanind commented 6 months ago

This PR adds support for the mlp_out and attn_out hooks in the visualizations. The activations at these hooks are added directly to the residual stream, so, like the resid hooks, they shouldn't need any further processing.
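
To illustrate the reasoning, here is a minimal sketch, not the code from this PR. It assumes TransformerLens-style hook names (e.g. `blocks.0.hook_mlp_out`). The helper `lives_in_residual_space` and the toy `toy_block` function are hypothetical names introduced only for this example.

```python
# Hypothetical sketch, not part of sae_vis. Hook-name suffixes are assumed to
# follow TransformerLens-style naming such as "blocks.0.hook_mlp_out".
RESIDUAL_SPACE_SUFFIXES = (
    "resid_pre",
    "resid_mid",
    "resid_post",
    "attn_out",
    "mlp_out",
)


def lives_in_residual_space(hook_name: str) -> bool:
    """Return True if the hook's activations are the residual stream itself
    or are added directly to it, so no extra projection is needed."""
    return hook_name.endswith(RESIDUAL_SPACE_SUFFIXES)


def toy_block(resid_pre, attn_out, mlp_out):
    """Toy transformer block on plain lists: both sublayer outputs are simply
    summed into the residual stream."""
    resid_mid = [r + a for r, a in zip(resid_pre, attn_out)]
    resid_post = [r + m for r, m in zip(resid_mid, mlp_out)]
    return resid_mid, resid_post


if __name__ == "__main__":
    print(lives_in_residual_space("blocks.3.hook_mlp_out"))  # residual-shaped
    print(lives_in_residual_space("blocks.3.attn.hook_z"))   # per-head, not residual-shaped
    print(toy_block([1.0, 2.0], [0.5, -1.0], [2.0, 0.0]))
```

Because `attn_out` and `mlp_out` have the same dimensionality as the residual stream and enter it by plain addition, an autoencoder trained at these hooks can be handled the same way as one trained at a resid hook.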