ai-safety-foundation / sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability
https://ai-safety-foundation.github.io/sparse_autoencoder/
MIT License

Log feature density histograms to weights and biases #88

Open jbloomAus opened 1 year ago

jbloomAus commented 1 year ago

See here

To do:

There are a few technical details here that could matter, but any solution would be a good start.

HoagyC commented 11 months ago

This should be closed: histograms and the number of dead features are now solidly in wandb.
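
For context, a minimal sketch of the quantity discussed in this issue, not the repository's actual implementation. It assumes a feature "fires" when its activation is greater than zero, takes density to be the fraction of inputs on which a feature fires, and treats a feature with zero density as dead. The function names `feature_densities` and `log10_density_histogram` and the toy data are hypothetical.

```python
import math
from collections import Counter


def feature_densities(activations):
    """Return, per feature, the fraction of rows where its activation is > 0."""
    n_rows = len(activations)
    n_features = len(activations[0])
    fired = [0] * n_features
    for row in activations:
        for j, value in enumerate(row):
            if value > 0:
                fired[j] += 1
    return [count / n_rows for count in fired]


def log10_density_histogram(densities, bin_width=1.0):
    """Bucket log10 densities of live features; count dead features separately.

    Dead features (density 0) have no finite log10, so they are excluded
    from the histogram and reported as a separate count.
    """
    dead = sum(1 for d in densities if d == 0)
    bins = Counter(
        math.floor(math.log10(d) / bin_width) * bin_width
        for d in densities
        if d > 0
    )
    return dict(bins), dead


# Toy batch: 4 inputs x 4 learned features (illustrative values only).
acts = [
    [0.0, 1.2, 0.0, 0.3],
    [0.0, 0.7, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 2.0, 0.5, 0.0],
]
densities = feature_densities(acts)
histogram, dead_count = log10_density_histogram(densities)
print(densities, histogram, dead_count)
```

In a training loop, the log densities of live features could then be passed to something like `wandb.log({"feature_density": wandb.Histogram(values), "dead_features": dead_count})`; the metric names here are placeholders.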