In a recent DeepMind paper, Improving Dictionary Learning with Gated Sparse Autoencoders (Rajamanoharan et al., 2024), the authors propose a Pareto improvement over the standard dictionary learning procedure.
This PR adds that recipe to the dictionary_learning repo.
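For reviewers unfamiliar with the paper, here is a minimal sketch of the gated encoder idea: a binary gating path decides which features fire, while a magnitude path (sharing weights with the gate up to a per-feature rescaling) decides how strongly. The class and attribute names below are illustrative only and may not match the actual GatedAutoEncoder in dictionary.py.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedAutoEncoderSketch(nn.Module):
    """Illustrative sketch of a gated SAE (Rajamanoharan et al., 2024).

    Names and initialisation are assumptions, not the repo's actual code.
    """

    def __init__(self, activation_dim: int, dict_size: int):
        super().__init__()
        self.W_gate = nn.Parameter(torch.randn(dict_size, activation_dim) * 0.01)
        self.b_gate = nn.Parameter(torch.zeros(dict_size))
        # The magnitude path shares W_gate up to a per-feature rescaling r_mag.
        self.r_mag = nn.Parameter(torch.zeros(dict_size))
        self.b_mag = nn.Parameter(torch.zeros(dict_size))
        self.W_dec = nn.Parameter(torch.randn(activation_dim, dict_size) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(activation_dim))

    def encode(self, x: torch.Tensor):
        x_centered = x - self.b_dec
        # Gating pre-activations: a binary gate decides WHICH features fire.
        pi_gate = x_centered @ self.W_gate.T + self.b_gate
        gate = (pi_gate > 0).float()
        # Magnitude path decides HOW STRONGLY the gated features fire.
        W_mag = torch.exp(self.r_mag)[:, None] * self.W_gate
        mag = F.relu(x_centered @ W_mag.T + self.b_mag)
        return gate * mag, pi_gate

    def decode(self, f: torch.Tensor) -> torch.Tensor:
        return f @ self.W_dec.T + self.b_dec

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f, _ = self.encode(x)
        return self.decode(f)
```

The binary gate is non-differentiable, which is exactly why the paper's loss routes gradients to the gating path through an auxiliary reconstruction term rather than through the gate itself.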
There are also a few other small changes: applying the black formatter, sorting imports with isort, and adding type hints to the relevant areas of the codebase.
I ran the example.py script to check that the GatedSAE training loop runs as expected.
Excited to train GatedSAEs!
Let me know if you have any questions about this. The formatting changes make the diff look large, but most of the substantive changes are localised to the GatedAutoEncoder class in dictionary.py and to updating the training script/loss function to work with it.
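To make the loss-function change easier to review, here is a self-contained sketch of the gated SAE training objective from the paper: reconstruction error, an L1 penalty on the ReLU'd gating pre-activations, and an auxiliary reconstruction term through a frozen copy of the decoder. The function signature and coefficient name are assumptions for illustration, not the PR's exact API.

```python
import torch
import torch.nn.functional as F


def gated_sae_loss(
    x: torch.Tensor,
    x_hat: torch.Tensor,
    pi_gate: torch.Tensor,
    decode_frozen,
    l1_coeff: float = 1e-3,
) -> torch.Tensor:
    """Sketch of the gated SAE loss (Rajamanoharan et al., 2024).

    decode_frozen should map feature activations back to input space using
    DETACHED decoder parameters (the caller's responsibility here), so the
    auxiliary term trains only the gating path.
    """
    # Main reconstruction term.
    recon = (x - x_hat).pow(2).sum(dim=-1).mean()
    # Sparsity penalty on the (rectified) gating pre-activations.
    sparsity = F.relu(pi_gate).sum(dim=-1).mean()
    # Auxiliary term: reconstruct from ReLU(pi_gate) via the frozen decoder,
    # giving the gating path a gradient signal despite the binary gate.
    x_aux = decode_frozen(F.relu(pi_gate))
    aux = (x - x_aux).pow(2).sum(dim=-1).mean()
    return recon + l1_coeff * sparsity + aux
```

Penalising pi_gate rather than the final activations is the key difference from the standard SAE loss, since the hard gate blocks gradients on the usual path.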
(Also, I used black's default settings since the repo doesn't specify a formatter. If you have a formatter you prefer, let me know, or feel free to run your own auto-formatter on top of this change before merging!)