EleutherAI / elk

Keeping language models honest by directly eliciting knowledge encoded in their activations.

train probe per prompt #271

Open derpyplops opened 1 year ago

derpyplops commented 1 year ago

Solves NOT-291

This is a fairly complex change. It trains a separate reporter model per prompt template, then evaluates each reporter both on its own individual prompt and on the mean credence averaged across prompts. I should also add tests for the new file structure.
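To make the idea concrete, here is a minimal, self-contained sketch of the evaluation scheme described above, using synthetic activations and a tiny NumPy logistic-regression probe. The array shapes, the `train_probe`/`credence` helpers, and the synthetic data are illustrative assumptions, not the actual elk implementation:

```python
import numpy as np

def train_probe(x, y, lr=0.1, steps=500):
    """Tiny logistic-regression probe trained by gradient descent (illustrative)."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * x.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def credence(x, probe):
    w, b = probe
    return 1 / (1 + np.exp(-(x @ w + b)))

rng = np.random.default_rng(0)
num_prompts, n, d = 3, 200, 8
labels = rng.integers(0, 2, size=n).astype(float)
# Synthetic stand-in for per-prompt hidden-state activations.
acts = [rng.normal(size=(n, d)) + labels[:, None] for _ in range(num_prompts)]

# Train one probe ("reporter") per prompt template.
probes = [train_probe(x, labels) for x in acts]

# Per-prompt evaluation: each probe scored on its own prompt's activations.
per_prompt_acc = [((credence(x, p) > 0.5) == labels).mean()
                  for x, p in zip(acts, probes)]

# Mean-credence evaluation: average each probe's credence across prompts,
# then threshold the averaged credence.
mean_cred = np.mean([credence(x, p) for x, p in zip(acts, probes)], axis=0)
mean_cred_acc = ((mean_cred > 0.5) == labels).mean()
```

The two accuracy numbers correspond to the two evaluation modes mentioned above: scoring each prompt's probe individually versus scoring the ensemble via mean credence.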

The new flag is `--probe_per_prompt`, added to `Run`.

To test, run `elk elicit gpt2 imdb --num_gpus 2 --probe_per_prompt` with and without the flag; `elk eval` should also work.