SAELens exists to help researchers:
Please refer to the documentation for information on how to:
SAE Lens is the result of many contributors working collectively to improve humanity's understanding of neural networks, many of whom are motivated by a desire to safeguard humanity from risks posed by artificial intelligence.
This library is maintained by Joseph Bloom and David Chanin.
Feel free to join the Open Source Mechanistic Interpretability Slack for support!
Research:
Reference Implementations: