Open fblissjr opened 1 month ago
@fblissjr Yes! We recently released an SAE tutorial on hidden layers, not embedding layers. But if you set the component to the embedding layer output, you could essentially replicate the results in this paper, IIUC: https://github.com/stanfordnlp/pyvene/blob/main/tutorials/basic_tutorials/Sparse_Autoencoder.ipynb
Suggestion / Feature Request
Been curious for a while now, and more so since reading Disentangling Dense Embeddings with Sparse Autoencoders (https://arxiv.org/html/2408.00657v2).
It looks like most of the ingredients to do this with text embeddings are already in pyvene?
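For anyone skimming the thread, here is a rough, dependency-free sketch of what "an SAE over a text embedding" means. It is not pyvene code and not necessarily the architecture used in the paper: it assumes a plain ReLU encoder with an L1 sparsity penalty, and `TinySAE`, `d_embed`, `d_latent`, and `l1_coeff` are hypothetical names chosen for illustration. The random vector stands in for a real embedding; in practice you would use PyTorch and train the weights.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class TinySAE:
    """Toy sparse autoencoder (assumed ReLU + L1 variant, untrained).

    features = ReLU(W_enc @ (x - b_dec) + b_enc)
    x_hat    = W_dec @ features + b_dec
    """

    def __init__(self, d_embed, d_latent, seed=0):
        rng = random.Random(seed)
        # Overcomplete dictionary: d_latent is typically larger than d_embed.
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(d_embed)]
                      for _ in range(d_latent)]
        self.b_enc = [0.0] * d_latent
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(d_latent)]
                      for _ in range(d_embed)]
        self.b_dec = [0.0] * d_embed

    def encode(self, x):
        centered = [xi - bi for xi, bi in zip(x, self.b_dec)]
        pre = matvec(self.W_enc, centered)
        # ReLU keeps feature activations non-negative.
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, features):
        return [r + b for r, b in zip(matvec(self.W_dec, features), self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        features = self.encode(x)
        x_hat = self.decode(features)
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        l1 = sum(abs(a) for a in features)  # sparsity penalty
        return mse + l1_coeff * l1


# Stand-in for a text embedding; a real one would come from an embedding model.
rng = random.Random(42)
embedding = [rng.gauss(0, 1) for _ in range(8)]

sae = TinySAE(d_embed=8, d_latent=32)
features = sae.encode(embedding)
reconstruction = sae.decode(features)
total_loss = sae.loss(embedding)
```

The only difference from the hidden-layer setup is where the input vector comes from: here it is the embedding output rather than an intermediate activation.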