A sample script that shows how to take a Hugging Face (HF) dataset, disaggregate it, run an Evaluator on it, and display the resulting evaluation metrics broken down by a combination of disaggregation modules.
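The flow described above can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the datasets or evaluate APIs, the rows are made-up toy data, and the names `disaggregate`, `evaluate_by_group`, `by_length`, and `by_language` are hypothetical stand-ins for whatever disaggregation modules and Evaluator the real script would use.

```python
from collections import defaultdict


def accuracy(predictions, references):
    """Stand-in metric: fraction of predictions equal to their reference."""
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


def disaggregate(rows, modules):
    """Group rows by the tuple of labels produced by every module."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(module(row) for module in modules)
        groups[key].append(row)
    return dict(groups)


def evaluate_by_group(rows, modules, metric=accuracy):
    """Compute the metric separately for each combination of module labels."""
    results = {}
    for key, group in disaggregate(rows, modules).items():
        predictions = [row["prediction"] for row in group]
        references = [row["label"] for row in group]
        results[key] = metric(predictions, references)
    return results


# Hypothetical disaggregation modules: each maps a row to a group label.
def by_length(row):
    return "short" if len(row["text"].split()) <= 3 else "long"


def by_language(row):
    return row["lang"]


# Toy rows standing in for a dataset with model predictions attached.
rows = [
    {"text": "good film", "lang": "en", "label": 1, "prediction": 1},
    {"text": "bad film", "lang": "en", "label": 0, "prediction": 1},
    {"text": "a long and rather tedious film", "lang": "en", "label": 0, "prediction": 0},
    {"text": "très bon", "lang": "fr", "label": 1, "prediction": 1},
]

results = evaluate_by_group(rows, [by_length, by_language])
for key, score in sorted(results.items()):
    print(key, score)
```

Keying each group by a tuple of module outputs is one simple way to express "a combination of disaggregation modules"; a real implementation might instead filter the dataset per group and pass each subset to an Evaluator.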
Ideally, this should be built without having to fork Evaluate. If a custom Evaluator is required, it can be added to this repo, with evaluate declared as an extra (optional) dependency of this library.
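If evaluate does become an optional extra, code that needs it would typically guard the import. The following is a minimal sketch of that pattern, not this library's actual code: the helper names `evaluate_available` and `load_evaluator` are hypothetical, and the extra name in the error message is a placeholder.

```python
import importlib.util


def evaluate_available():
    """Return True if the optional evaluate package can be imported."""
    return importlib.util.find_spec("evaluate") is not None


def load_evaluator(task):
    """Build an Evaluator for the given task, or fail with a clear message."""
    if not evaluate_available():
        # "evaluate" as an extra name is a placeholder, not a published extra.
        raise ImportError(
            "This feature needs the optional 'evaluate' dependency; "
            "install it with the library's evaluate extra."
        )
    import evaluate  # imported lazily so the base install works without it

    return evaluate.evaluator(task)
```

Importing lazily inside the function keeps the base library importable when the extra is not installed.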
Required:
evaluate