openai / sparse_autoencoder

MIT License

Files with activations to run viewer locally #2

Open WuTheFWasThat opened 5 months ago

WuTheFWasThat commented 5 months ago

[this issue is migrated from a previous version of the repo]

@elephantmipt and @zdaiot ask:

Hi, thank you for releasing this great tool! It would be great if you could provide more details on which files the SAE viewer uses to render a particular feature. When I run the published code locally, I get the error Failed to load resource: net::ERR_CONNECTION_REFUSED, because here https://github.com/openai/sparse_autoencoder/blob/0296e02c8e3bce46015375d99ce53188c3d0167f/sae-viewer/src/interpAPI.ts#L4-L20 the code tries to connect to localhost:8000 (the load_az method).

I need an example of a file with activations that I can pass as JSON here https://github.com/openai/sparse_autoencoder/blob/0296e02c8e3bce46015375d99ce53188c3d0167f/sae-viewer/src/interpAPI.ts#L55, so that I can view features from my custom SAE. However, I am unsure which format to use. Could you please provide example files with activations so that anyone can run the demo locally and/or modify the activations and models?

WuTheFWasThat commented 5 months ago

example file: https://openaipublic.blob.core.windows.net/sparse-autoencoder/viewer/gpt2-small/v5_32k/layer_8/resid_post_mlp/atoms/0-ablated.json

the ablation data is optional; a simpler format is:

https://openaipublic.blob.core.windows.net/sparse-autoencoder/viewer/gpt2-small/v5_32k/layer_8/resid_post_mlp/atoms/0.json
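Since the thread does not spell out the schema of these files, one way to learn the format is to download an example and print its structure. The sketch below is only an illustration: the helper names load_json_from_url and summarize are hypothetical, and it assumes nothing about the file beyond it being valid JSON.

```python
import json
import urllib.request


def load_json_from_url(url):
    """Download and parse a JSON document (hypothetical helper, not part of the repo)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def summarize(value, depth=0, max_depth=3):
    """Describe the shape of a parsed JSON value without assuming any schema.

    Dicts are expanded key by key, lists are described by their length and
    their first element, and everything else is reduced to its type name.
    """
    if isinstance(value, dict):
        if depth >= max_depth:
            return "dict"
        return {k: summarize(v, depth + 1, max_depth) for k, v in value.items()}
    if isinstance(value, list):
        if not value or depth >= max_depth:
            return f"list[{len(value)}]"
        return [f"list[{len(value)}] of", summarize(value[0], depth + 1, max_depth)]
    return type(value).__name__
```

To use it, pass one of the example URLs above to load_json_from_url and print json.dumps(summarize(data), indent=2). The resulting outline shows which keys a custom file would probably need to provide.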

zdaiot commented 5 months ago

@WuTheFWasThat Can you describe the process used to generate these JSON files?

WuTheFWasThat commented 5 months ago

we just ran the model+autoencoder over a pretraining dataset and tracked highest activations per latent and random positive activations per latent. we don't have a plan to release the code atm, but if that would be helpful we can consider it
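Since the generation code has not been released, the following is only a rough sketch of the bookkeeping described above, not the authors' implementation. It assumes the latent activations have already been computed and arrive as plain nested lists per document. The names LatentTracker and track_activations, the record layout (activation, doc_id, token_idx), and the use of a min-heap plus reservoir sampling are all illustrative choices.

```python
import heapq
import random
from dataclasses import dataclass, field


@dataclass
class LatentTracker:
    """Tracks the strongest and a uniform random sample of positive activations for one latent."""
    k_top: int = 5
    k_random: int = 5
    top: list = field(default_factory=list)     # min-heap of (activation, doc_id, token_idx)
    sample: list = field(default_factory=list)  # reservoir of (activation, doc_id, token_idx)
    n_positive: int = 0                         # positive activations seen so far

    def update(self, activation, doc_id, token_idx, rng):
        if activation <= 0:
            return  # only positive activations are recorded
        record = (activation, doc_id, token_idx)
        # Top-k: keep the k largest records; the heap root is the smallest kept.
        if len(self.top) < self.k_top:
            heapq.heappush(self.top, record)
        elif record > self.top[0]:
            heapq.heapreplace(self.top, record)
        # Reservoir sampling (Algorithm R): every positive record ends up in
        # the sample with equal probability, without storing all of them.
        self.n_positive += 1
        if len(self.sample) < self.k_random:
            self.sample.append(record)
        else:
            j = rng.randrange(self.n_positive)
            if j < self.k_random:
                self.sample[j] = record


def track_activations(docs, n_latents, k_top=5, k_random=5, seed=0):
    """docs yields (doc_id, token_acts), where token_acts[t][i] is latent i at token t."""
    rng = random.Random(seed)
    trackers = [LatentTracker(k_top, k_random) for _ in range(n_latents)]
    for doc_id, token_acts in docs:
        for token_idx, latent_acts in enumerate(token_acts):
            for latent_idx, act in enumerate(latent_acts):
                trackers[latent_idx].update(act, doc_id, token_idx, rng)
    return trackers
```

A real pipeline over a pretraining dataset would presumably vectorize this and exploit the sparsity of the activations. Writing the tracked records out in the viewer's JSON layout is a separate step that this sketch does not attempt.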

zdaiot commented 5 months ago

Thanks a lot. I think it will be very helpful. Please consider it.

Thank you for the excellent work. May I ask when the training code will be released?

taha-yassine commented 3 months ago

Hi, thanks for the excellent work! I'm also interested in the code to generate the JSONs if you ever decide to release it. I think it would be helpful.

w1051868626 commented 1 month ago

Me too. I'm also interested in the code that generates the JSON files, if you decide to publish it. I think it would help.