EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.
MIT License
178 stars 33 forks source link

Use spectral attribute removal for normalization #233

Closed norabelrose closed 1 year ago

norabelrose commented 1 year ago

Too tired to explain what this is right now, will add a description tomorrow

norabelrose commented 1 year ago

Some bizarre nondeterministic test failure occurring right now