OpenMOSS / Language-Model-SAEs

For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research.

fix(activation gen): remove bos. ce score is greatly improved #24

Closed: Hzfinfdu closed this 5 days ago