jbloomAus / SAELens

Training Sparse Autoencoders on Language Models
https://jbloomaus.github.io/SAELens/
MIT License

SAE Lens

SAELens exists to help researchers:

Please refer to the documentation for information on how to:

SAE Lens is the result of many contributors working collectively to improve humanity's understanding of neural networks. Many of them are motivated by a desire to safeguard humanity from risks posed by artificial intelligence.

This library is maintained by Joseph Bloom and David Chanin.

Tutorials

Join the Slack!

Feel free to join the Open Source Mechanistic Interpretability Slack for support!
