Closed DavidUdell closed 9 months ago
feature_web now yields causal graphs for sparse autoencoder features, both on RASP toy models from tracr and on full-scale Hugging Face transformers.