Open binaryaaron opened 3 months ago
#114 changes the `hellaswag` task to `GLUE` to get around the flaky test issue. https://github.com/EleutherAI/lm-evaluation-harness/pull/2029 addresses hellaswag specifically, though it's not included in a release of lm-eval yet.

We need to ensure lm-buddy can pass the arg to lm-eval as well as to hf models/datasets.