This updates the evaluate benchmark CLI to use the new API, simplifying the script a bit.
@pfliu-nlp : I'm marking this as a draft because there is currently no integration test covering the evaluate_benchmark CLI. I think it'd be a good idea to add one. Would you mind adding a simple one?
Blocked by https://github.com/neulab/explainaboard_client/pull/33