Note - I'm planning a major refactor of this library soon; the branch can be found here (it contains demo examples too). The current implementation is an unholy patchwork of Python / HTML / JavaScript. The new version is much simpler: the vis starts from a minimal pre-existing HTML framework and is populated using JavaScript, and the only way Python interfaces with JavaScript is to dump a single DATA dictionary into the JavaScript page. I've also created an Othello SAE vis, pictured below (also see it on my personal website homepage). I plan to get around to pushing updates to this library in late September / early October, so watch this space!
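As a rough illustration of the "single DATA dictionary" idea described above, here is a minimal sketch. It is not the refactored library's code: the template, the `__DATA_JSON__` placeholder, the `render_page` helper and the example keys are all hypothetical, chosen only to show how Python could hand one JSON-serialisable dictionary to a page that JavaScript then populates.

```python
import json

# Hypothetical minimal HTML framework (an assumption, not the library's real template).
# The placeholder __DATA_JSON__ is replaced with the serialised dictionary.
HTML_TEMPLATE = """<!DOCTYPE html>
<html>
<body>
<div id="vis"></div>
<script>
const DATA = __DATA_JSON__;
// JavaScript takes over from here and fills in the page from DATA.
document.getElementById("vis").textContent = Object.keys(DATA).join(", ");
</script>
</body>
</html>
"""


def render_page(data: dict) -> str:
    """Return an HTML page with `data` embedded as a single JavaScript constant."""
    # Escape "</" so that a string such as "</script>" inside the data
    # cannot terminate the script element early. "\/" is a valid JSON escape.
    data_json = json.dumps(data).replace("</", "<\\/")
    return HTML_TEMPLATE.replace("__DATA_JSON__", data_json)


if __name__ == "__main__":
    # Example payload with made-up keys, purely for illustration.
    example = {"feature_0": {"top_tokens": ["the", "cat"], "activations": [0.9, 0.4]}}
    print(render_page(example))
```

Keeping the interface down to one serialised dictionary means the Python side never has to generate HTML or JavaScript fragments, which is the simplification the note describes.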
This codebase was designed to replicate Anthropic's sparse autoencoder visualisations, which you can see here. The codebase provides 2 different views: a feature-centric view (which is like the one in the link, i.e. we look at one particular feature and see things like which tokens fire strongest on that feature) and a prompt-centric view (where we look at one particular prompt and see which features fire strongest on that prompt according to a variety of different metrics).
Install with pip install sae-vis. Link to PyPI page here.
Important note - this repo was significantly restructured in March 2024 (we'll remove this message at the end of April). The recent changes include:
Here is a link to a Google Drive folder containing 3 files:
In the demo Colab, we show the two different types of vis which are supported by this library:
To cite this work, you can use this bibtex citation:
@misc{sae_vis,
title = {{SAE Visualizer}},
author = {Callum McDougall},
howpublished = {\url{https://github.com/callummcdougall/sae_vis}},
year = {2024}
}
This project uses Poetry for dependency management. After cloning the repo, install dependencies with poetry install.
This project uses Ruff for formatting and linting, Pyright for type-checking, and Pytest for tests. If you submit a PR, make sure that your code passes all checks. You can run all checks with make check-all.
0.2.9 - added table for pairwise feature correlations (not just encoder-B correlations)
0.2.10 - fix some anomalous characters
0.2.11 - update PyPI with longer description
0.2.12 - fix height parameter of config, add videos to PyPI description
0.2.13 - add to dependencies, and fix SAELens section
0.2.14 - fix mistake in dependencies
0.2.15 - refactor to support eventual scatterplot-based feature browser, fix ’ HTML
0.2.16 - allow disabling buffer in feature generation, fix demo notebook, fix sae-lens compatibility & type checking
0.2.17 - use main branch of sae-lens
0.2.18 - remove circular dependency with sae-lens
0.2.19 - formatting, error-checking
0.2.20 - fix bugs, remove use of batch_size in config
0.2.21 - formatting