jbloomAus / SAELens

Training Sparse Autoencoders on Language Models
https://jbloomaus.github.io/SAELens/
MIT License
386 stars 106 forks

feat: activation norm scaling factor folding #170

Closed jbloomAus closed 4 months ago

jbloomAus commented 4 months ago

Description

Sometimes SAEs are trained on normalized activations. Ideally we would save these SAEs with the norm scaling factor already folded in, but in the meantime I've written a convenience function that folds the scaling factor into the weights, and refactored its calculation so the method belongs to the activation store.

I think we're working up to a "processing step" in load-from-pretrained where this and similar adjustments are handled by default.
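For context, here is a minimal sketch of what "folding in" the scaling factor means, assuming the usual SAELens-style forward pass `ReLU((x - b_dec) @ W_enc + b_enc) @ W_dec + b_dec` and a scaling factor estimated roughly as `sqrt(d_in) / mean ||x||` over a batch from the activation store. The function name and signature below are illustrative, not the exact API added in this PR:

```python
import torch


def fold_scaling_factor_sketch(
    W_enc: torch.Tensor,  # [d_in, d_sae]
    W_dec: torch.Tensor,  # [d_sae, d_in]
    b_dec: torch.Tensor,  # [d_in]
    scaling_factor: float,
) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Fold an activation-norm scaling factor into SAE weights (illustrative sketch).

    If the SAE was trained on scaled activations x_scaled = s * x, then
        ReLU((x_scaled - b_dec) @ W_enc + b_enc)
      = ReLU((x - b_dec / s) @ (s * W_enc) + b_enc),
    so multiplying W_enc by s and dividing b_dec by s lets the SAE run
    directly on unnormalized activations; dividing W_dec by s maps the
    reconstruction back to the raw activation scale. b_enc is unchanged.
    """
    return (
        W_enc * scaling_factor,
        W_dec / scaling_factor,
        b_dec / scaling_factor,
    )
```

The scaling factor itself (whose calculation this PR moves onto the activation store) is typically estimated from stored activations as something like `scaling_factor = sqrt(d_in) / activations.norm(dim=-1).mean()`.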

Fixes # (issue)

Type of change

Please delete options that are not relevant.

Checklist:

You have tested formatting, typing and unit tests (acceptance tests not currently in use)

codecov[bot] commented 4 months ago

Codecov Report

Attention: Patch coverage is 88.88889%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 59.15%. Comparing base (f1908a3) to head (fcef286). Report is 3 commits behind head on main.

| Files | Patch % | Lines |
| --- | --- | --- |
| sae_lens/training/sae_trainer.py | 33.33% | 1 Missing and 1 partial :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #170      +/-   ##
==========================================
+ Coverage   56.35%   59.15%   +2.79%
==========================================
  Files          25       25
  Lines        2603     2595       -8
  Branches      440      439       -1
==========================================
+ Hits         1467     1535      +68
+ Misses       1061      983      -78
- Partials       75       77       +2
```
