EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.

Revert "Multi datasets" #158

Closed. lauritowal closed this 1 year ago.

lauritowal commented 1 year ago

Reverts EleutherAI/elk#123

Sorry, I somehow misread Nora's previous message. I will revert the pull request for now; we might want to look more closely into a possible performance regression first.

Update: Before reverting, we will test whether the changes actually caused a performance regression on GPT-2... If not, we'll leave it as it is.

AlexTMallen commented 1 year ago

Where have we observed a performance regression? Does this mean the VINC AUROC decreases because of #123?

lauritowal commented 1 year ago

@AlexTMallen yeah, when testing the changes from the pull request https://github.com/EleutherAI/elk/pull/123 I noticed some decrease in VINC AUROC on GPT-2 (especially in the last layer).

However, I wanted to be sure and look into it again... But I wasn't able to test it again today. I tried `elk elicit gpt2 imdb --num_gpus 2` a few times, but the machine ending in .195 keeps freezing for me at some point, and the one ending in .37 is down right now. I also wanted to try out some bigger models, like DeBERTa. I might try this on a different machine tomorrow. If you can / want to, you could have a look at it too: check whether the performance before merging the pull request is similar to the performance after merging it for some models (maybe GPT-2 and DeBERTa...) and post the results here. If they are quite similar, we can decide to just close this revert pull request.
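For reference, a minimal sketch of the before/after comparison described above, assuming the merge commit SHA of #123 is known (`<merge-sha>` below is a placeholder, not a real value); the `elk elicit` invocation is the one quoted in this thread:

```bash
# Run elicitation at the commit just before #123 was merged
# (the first parent of the merge commit).
git checkout <merge-sha>^
elk elicit gpt2 imdb --num_gpus 2

# Run the same elicitation at the merge commit itself,
# then compare the reported VINC AUROC per layer between the two runs.
git checkout <merge-sha>
elk elicit gpt2 imdb --num_gpus 2
```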

lauritowal commented 1 year ago

Thanks for testing that, Alex!