save SAEFeatureExamples++ to persistent storage

What I think we should actually do:

tokenize the dataset {seq len}
run a model on a sizable dataset; save all resid activations, logits and clean loss; upload to HF {model, tokenized dataset, seq len}
decompose the activations with an SAE; for each feature, save the (batch, pos) pairs where the feature was active + the max feature activation {SAE}
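
A rough sketch of how the parameters in braces could chain into storage keys, so that each saved artifact is identified by everything it depends on. All names here (`TokenizedDatasetParams`, `activations_key`, `feature_locations_key`, the example strings) are hypothetical and not taken from the repo; the hashing scheme is just one possible choice.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


def _key(*parts) -> str:
    """Deterministic short hash of JSON-serializable parts (assumed scheme)."""
    payload = json.dumps(parts, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class TokenizedDatasetParams:
    """Hypothetical container for the tokenized dataset parameters."""
    source_dataset: str  # source dataset name
    split: str           # dataset split, e.g. "train"
    tokenizer: str       # tokenizer name
    seq_len: int         # sequence length in tokens

    def key(self) -> str:
        return _key("tokens", asdict(self))


def activations_key(model_name: str, tokens: TokenizedDatasetParams) -> str:
    """Key for resid acts / logits / loss: depends on the model and the tokens."""
    return _key("acts", model_name, tokens.key())


def feature_locations_key(sae_name: str, acts_key: str) -> str:
    """Key for active feature locations: depends on the SAE and the activations."""
    return _key("features", sae_name, acts_key)


# Placeholder names, for illustration only.
tokens = TokenizedDatasetParams("example/dataset", "train", "example-tokenizer", 128)
acts = activations_key("example-model", tokens)
feats = feature_locations_key("example-sae", acts)
```

With keys built this way, changing the seq len invalidates every downstream artifact, while swapping the SAE only changes the last key.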

object/step 1: tokenized dataset parameters: source dataset name, dataset split, tokenizer, seq. len

object/step 2: resid acts, logits and loss

object/step 3: active feature locations, meaning: for each feature, a list of (batch, pos) pairs where it was active + its max activation
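
A minimal sketch of what object 3 could look like, using plain nested lists in place of real SAE output. The function name, the `threshold` parameter (treating "active" as activation > 0), and the output layout are assumptions for illustration.

```python
from collections import defaultdict


def active_feature_locations(feature_acts, threshold=0.0):
    """Collect, per feature, where it was active and its max activation.

    feature_acts[batch][pos][feature] holds the SAE feature activations.
    Returns {feature: {"locations": [(batch, pos), ...], "max_activation": float}}.
    Features that are never active are omitted.
    """
    locations = defaultdict(list)
    max_act = {}
    for b, seq in enumerate(feature_acts):
        for p, feats in enumerate(seq):
            for f, a in enumerate(feats):
                if a > threshold:  # assumed definition of "active"
                    locations[f].append((b, p))
                    max_act[f] = max(a, max_act.get(f, a))
    return {
        f: {"locations": locs, "max_activation": max_act[f]}
        for f, locs in locations.items()
    }


# Toy input: 2 sequences, 2 positions, 3 features.
toy_acts = [
    [[0.0, 1.5, 0.0], [0.2, 0.0, 0.0]],
    [[0.0, 3.0, 0.0], [0.0, 0.0, 0.0]],
]
result = active_feature_locations(toy_acts)
```

Storing only the active (batch, pos) pairs keeps the saved object sparse; the full activations can be recomputed from object 2 when needed.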

we can take it for a test run by just using a small subset of the dataset

jettjaniak / teren