The current activation store implementation has several drawbacks. We may want to add new features for a streaming activation store and make some optimizations. Details below.
Text Dataset Collate Config
We need to support SAE training on both pretraining and SFT data, unlike Anthropic's Scaling Monosemanticity, in which only pretraining data is used to train SAEs on a supervised fine-tuned model.
IMO, pretraining data should be packed, while SFT data should be sorted by length and batched with post-padding. Activations in the residual stream at padding token positions should be ignored in SAE training. I believe this better matches the real-world distribution.
We need to add an option to the configuration to control this.
[x] Support two types of activation generation (packed pretraining data and length-sorted, padded SFT data)
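The two collate modes above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation; the names `TextDatasetCollateConfig` and `collate` are hypothetical, and the returned mask marks which residual-stream positions should be kept for SAE training (padding positions are masked out).

```python
import torch
from dataclasses import dataclass
from typing import Literal


@dataclass
class TextDatasetCollateConfig:
    # "packed": concatenate documents into fixed-length blocks (pretraining data)
    # "padded": sort by length and batch with post-padding (SFT data)
    mode: Literal["packed", "padded"] = "packed"
    context_size: int = 128
    pad_token_id: int = 0


def collate(token_seqs, cfg):
    """Collate tokenized sequences according to cfg.mode.

    Returns (tokens, mask): mask is True at positions whose activations
    should be used for SAE training; padding positions are False.
    """
    if cfg.mode == "packed":
        flat = torch.cat([torch.as_tensor(s) for s in token_seqs])
        n_blocks = flat.numel() // cfg.context_size
        tokens = flat[: n_blocks * cfg.context_size].view(n_blocks, cfg.context_size)
        mask = torch.ones_like(tokens, dtype=torch.bool)
    else:  # "padded"
        seqs = sorted((torch.as_tensor(s) for s in token_seqs), key=len, reverse=True)
        max_len = len(seqs[0])
        tokens = torch.full((len(seqs), max_len), cfg.pad_token_id, dtype=torch.long)
        mask = torch.zeros_like(tokens, dtype=torch.bool)
        for i, s in enumerate(seqs):
            tokens[i, : len(s)] = s
            mask[i, : len(s)] = True
    return tokens, mask
```

Downstream, the activation store would apply the mask after the forward pass, so padded positions never enter the SAE training buffer.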
Shuffle
When training SAEs on data from multiple distributions, shuffling should be an option so that each batch draws on diverse information. This can be implemented by filling the activation buffer from randomly selected sources.
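A sketch of this buffer-filling strategy, assuming each source is an iterator yielding `[n, d_model]` activation tensors (the function name `refill_buffer` is hypothetical):

```python
import random

import torch


def refill_buffer(buffer, sources, target_size, rng=None):
    """Fill an activation buffer by drawing chunks from randomly chosen
    sources, then shuffle rows so each training batch mixes distributions.

    buffer: list of [n, d_model] tensors already in the buffer.
    sources: list of iterators, each yielding [n, d_model] activation
             tensors from one dataset (e.g. pretraining vs. SFT).
    target_size: minimum total number of activation rows to accumulate.
    """
    rng = rng or random.Random()
    while sum(chunk.shape[0] for chunk in buffer) < target_size:
        src = rng.choice(sources)  # random source per chunk
        buffer.append(next(src))
    acts = torch.cat(buffer)
    # Row-level shuffle so consecutive batches interleave sources.
    perm = torch.randperm(acts.shape[0])
    return acts[perm]
```

Sampling at the chunk level keeps each source's own streaming order intact while the final row permutation removes within-chunk correlation, which is usually sufficient when the buffer is several times larger than a training batch.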