openai / automated-interpretability


More unified dataset #4

Closed shayneoneill closed 1 year ago

shayneoneill commented 1 year ago

Hi!

Spectacular work here, folks. Is there any plan to release a more unified dataset? That is, rather than having to request every neuron on every layer individually, one could download a single monolithic file that could be, say, indexed in a database for searchability, or whatever.

This would be very useful for guiding alignment efforts and for general research on how GPT's internal ontology works. For example, one could load the data into Neo4j and apply some good old-fashioned graph-theory number crunching to try to work out what's up with the nodes GPT-4 couldn't make heads or tails of: are they part of the deep structure of its linguistic thinking, are they secondary nodes to superpositions, and so on? My intuition tells me these are solvable.

WuTheFWasThat commented 1 year ago

we're not planning to do this, but anyone should feel free to try such things! agree it could be exciting
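
For anyone who does want to try, here is a minimal sketch of collecting the per-neuron files into one searchable SQLite database. The URL pattern is taken from the explanation link quoted in this thread; the table layout and the function names (`explanation_url`, `fetch_jsonl`, `create_table`, `index_lines`) are hypothetical and chosen only for illustration. Because the JSONL schema is not described here, each record is stored as raw JSON text instead of being split into columns.

```python
import json
import sqlite3
import urllib.request

# Base URL copied from the explanation link quoted in this thread.
BASE_URL = "https://openaipublic.blob.core.windows.net/neuron-explainer/data/explanations"


def explanation_url(layer, neuron):
    """Build the per-neuron JSONL URL, following the pattern of the quoted link."""
    return f"{BASE_URL}/{layer}/{neuron}.jsonl"


def fetch_jsonl(url):
    """Download one JSONL file and return its non-empty lines (needs network access)."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return [line for line in text.splitlines() if line.strip()]


def create_table(conn):
    """Create a hypothetical table: one row per JSONL record, kept as raw JSON text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS explanations "
        "(layer INTEGER, neuron INTEGER, record TEXT)"
    )


def index_lines(conn, layer, neuron, lines):
    """Validate each line as JSON, insert it, and return the number of rows added."""
    rows = []
    for line in lines:
        json.loads(line)  # raises ValueError if the line is not valid JSON
        rows.append((layer, neuron, line))
    conn.executemany(
        "INSERT INTO explanations (layer, neuron, record) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

A full crawl would call `fetch_jsonl(explanation_url(layer, neuron))` for each layer and neuron of interest and pass the result to `index_lines`. The sketch does not assume how many layers or neurons exist, and a real run should catch download errors for files that turn out to be missing.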

diziet commented 1 year ago

Jeff @WuTheFWasThat, https://openaipublic.blob.core.windows.net/neuron-explainer/neuron-viewer/index.html#/layers/31/neurons/1594 is missing the explanations for this neuron:

https://openaipublic.blob.core.windows.net/neuron-explainer/data/explanations/31/1593.jsonl

WuTheFWasThat commented 1 year ago

whoops, good to see you alex. not sure what went wrong but we likely won't fix