EleutherAI / lm-scope


add lm-eval-harness #10

Open jmerizia opened 2 years ago

jmerizia commented 2 years ago

There are many good benchmarks in lm-eval-harness [0] that have been verified to work well on GPT-J, so this is a good data source for visualization purposes.

[0] https://github.com/EleutherAI/lm-evaluation-harness

jmerizia commented 2 years ago

One consideration is that some prompts in the harness might be long. So we might have to save the prompt activations as a sparse matrix, pruning values below a threshold (many values are near zero anyway).
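A minimal sketch of the threshold-pruning idea using scipy's sparse matrices (the function name and the threshold value are placeholders, not anything from this repo):

```python
import numpy as np
from scipy.sparse import csr_matrix


def prune_to_sparse(values, threshold=1e-4):
    """Zero out entries with magnitude below `threshold` and
    store the result as a CSR sparse matrix."""
    dense = np.asarray(values, dtype=np.float32).copy()
    dense[np.abs(dense) < threshold] = 0.0
    return csr_matrix(dense)


# Example: values for a long prompt, mostly near zero.
acts = np.array([[0.9, 1e-6, 0.0],
                 [2e-5, 0.5, 3e-7]])
sparse = prune_to_sparse(acts)
print(sparse.nnz)  # only the 2 entries above the threshold remain
```

CSR keeps row-wise access cheap, which fits iterating over tokens in a prompt; the storage win depends on how aggressively the threshold prunes.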