EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.
MIT License

CCS results now as expected #232

Closed norabelrose closed 1 year ago

norabelrose commented 1 year ago

Previously we were (embarrassingly) not normalizing the hidden states before running evaluation. Now we do.
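For readers unfamiliar with why this matters, here is a minimal sketch of the effect, not the actual elk code. It assumes per-dimension mean/std normalization and a toy linear probe; `fit_normalizer`, `normalize`, `probe_predict`, and the numbers are all hypothetical and chosen only for illustration. A probe fit on normalized hidden states can give very different predictions if the evaluation hidden states are passed in raw.

```python
from statistics import fmean, pstdev


def fit_normalizer(hiddens, eps=1e-8):
    """Per-dimension mean and (population) std over a list of hidden-state vectors."""
    dims = list(zip(*hiddens))  # transpose: one tuple per hidden dimension
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) + eps for d in dims]  # eps guards against zero variance
    return means, stds


def normalize(hiddens, means, stds):
    """Subtract the mean and divide by the std in every dimension."""
    return [[(x - m) / s for x, m, s in zip(h, means, stds)] for h in hiddens]


def probe_predict(hiddens, weights, bias=0.0):
    """Toy linear probe: predict True when w . h + b > 0."""
    return [sum(w * x for w, x in zip(weights, h)) + bias > 0 for h in hiddens]


# Hypothetical evaluation hidden states with a large offset in dimension 0.
eval_hiddens = [[10.0, 2.0], [12.0, 4.0], [14.0, 6.0]]
weights = [1.0, 0.0]  # assumed to have been fit on normalized inputs

# Evaluating on raw hidden states: the offset dominates every score.
raw_preds = probe_predict(eval_hiddens, weights)

# Evaluating on normalized hidden states: the probe sees centered inputs.
means, stds = fit_normalizer(eval_hiddens)
norm_hiddens = normalize(eval_hiddens, means, stds)
norm_preds = probe_predict(norm_hiddens, weights)

print(raw_preds, norm_preds)
```

In this toy case the raw inputs push every score above zero, while the normalized inputs are split around zero, so the two evaluations disagree even though the probe itself is unchanged.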

TODO: Do the same thing for VINC eval