DavidUdell / sparse_circuit_discovery

Circuit discovery in GPT-2 small, using sparse autoencoding
MIT License
7 stars, 1 fork

Drop the cross-referencing of cached features; plot those with logits only, if need be. #85

Closed (DavidUdell closed 7 months ago)

DavidUdell commented 7 months ago

It looks like the most preferred neurons are generally off-distribution.

DavidUdell commented 7 months ago

Do this generally, I think: wherever there's an intersection op in place, substitute an operation that includes the relevant data when available, but otherwise simply continues without it.
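
A minimal sketch of that substitution, not the repo's actual code. The names here (`intersect_features`, `attach_if_available`, `effects`, `cached`) are hypothetical, and it assumes the two data sources are plain dicts keyed by feature index. The first function is the intersection behavior to be replaced; the second keeps every entry from the primary source and attaches the cached data only where it exists.

```python
from typing import Dict, Optional, Tuple


def intersect_features(
    effects: Dict[int, float], cached: Dict[int, str]
) -> Dict[int, Tuple[float, str]]:
    """Old behavior (hypothetical): keep only indices present in BOTH dicts."""
    shared = effects.keys() & cached.keys()
    return {i: (effects[i], cached[i]) for i in shared}


def attach_if_available(
    effects: Dict[int, float], cached: Dict[int, str]
) -> Dict[int, Tuple[float, Optional[str]]]:
    """Proposed behavior (hypothetical): keep every index in `effects`.

    Cached data is included when available; otherwise the slot is None
    and the entry is kept anyway.
    """
    return {i: (value, cached.get(i)) for i, value in effects.items()}


# Illustrative, made-up values: three features with effects, one with cached data.
effects = {3: 0.9, 7: -0.4, 11: 0.2}
cached = {7: "cached label"}

old = intersect_features(effects, cached)
new = attach_if_available(effects, cached)
```

With these inputs the intersection drops features 3 and 11 entirely, while the replacement keeps all three and leaves the missing cached data as `None`, so downstream plotting can fall back to logits alone.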