ai-safety-foundation / sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability
https://ai-safety-foundation.github.io/sparse_autoencoder/
MIT License
137 stars, 35 forks