EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.
MIT License
182 stars 33 forks source link

Remove check_separability #267

Closed norabelrose closed 1 year ago

norabelrose commented 1 year ago

This provably superfluous due to Theorem 3.1 in the LEACE paper. Any normalization scheme that makes sure the means are equal will max out the pseudo-AUROC.