Log feature density histograms to Weights & Biases

ai-safety-foundation / sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

MIT License

Open jbloomAus opened 1 year ago

jbloomAus commented 1 year ago

To do:

There are a few technical details here that could matter, but any solution would be a good start.

HoagyC commented 11 months ago

This should be closed: histograms and the number of dead features are now solidly in wandb.
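
For readers landing on this issue, here is a minimal sketch of what logging feature densities and a dead-feature count could look like. It is not the repository's actual implementation. The function names (`feature_densities`, `density_summary`, `log_to_wandb`), the metric keys, and the `eps` offset are all hypothetical, and plain Python lists stand in for the activation tensors a real training loop would use. Only `wandb.log` and `wandb.Histogram` are real wandb calls, and they assume `wandb.init()` has already been run.

```python
from __future__ import annotations

import math
from typing import Sequence


def feature_densities(activations: Sequence[Sequence[float]]) -> list[float]:
    """Fraction of inputs on which each learned feature is active (> 0).

    `activations` has one row per input and one column per feature.
    """
    n_inputs = len(activations)
    n_features = len(activations[0])
    fired = [0] * n_features
    for row in activations:
        for j, value in enumerate(row):
            if value > 0:
                fired[j] += 1
    return [count / n_inputs for count in fired]


def density_summary(densities: Sequence[float], eps: float = 1e-10) -> dict:
    """Log10 densities (offset by eps so zero is finite) and dead-feature count."""
    return {
        "log10_density": [math.log10(d + eps) for d in densities],
        # Here a feature counts as "dead" if it never fired in this window.
        "dead_features": sum(1 for d in densities if d == 0),
    }


def log_to_wandb(summary: dict, step: int) -> None:
    """Send the summary to Weights & Biases (requires a prior wandb.init())."""
    import wandb  # imported lazily so the helpers above work without wandb

    wandb.log(
        {
            # Hypothetical metric names; pick whatever fits your dashboard.
            "feature_density/log10_histogram": wandb.Histogram(
                summary["log10_density"]
            ),
            "feature_density/dead_features": summary["dead_features"],
        },
        step=step,
    )


# Toy example: 4 inputs, 3 features; feature 0 never fires.
example_activations = [
    [0.0, 1.2, 0.0],
    [0.0, 0.3, 0.5],
    [0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
]
example_densities = feature_densities(example_activations)
example_summary = density_summary(example_densities)
```

Plotting on a log10 scale is a common choice because densities typically span several orders of magnitude. Whether "dead" should mean exactly zero firings or firing below some threshold is one of the details that could matter in practice.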