mozilla-ai / lm-buddy

Your buddy in the (L)LM space.

Apache License 2.0


Open binaryaaron opened 3 months ago

binaryaaron commented 3 months ago

#114 changes the `hellaswag` task to `GLUE` to work around the flaky test issue.

https://github.com/EleutherAI/lm-evaluation-harness/pull/2029 addresses hellaswag specifically, though it's not included in a release of lm-eval yet.

We need to ensure lm-buddy can pass the arg through to lm-eval as well as to HF models/datasets.