This PR cleans up the benchmark interface. The interface is now a dataclass with the attributes train_dataset, test_dataset, and metrics. Additionally, it introduces metrics that cover prediction errors and operator properties.
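As a rough illustration of the dataclass shape described above, here is a minimal sketch. Only the class name Benchmark and the three attribute names come from this PR; the field types, the default for metrics, and the sample values are assumptions made for the example and may differ from the actual code.

```python
from dataclasses import dataclass, field
from typing import Any, List, Sequence


@dataclass
class Benchmark:
    """Hypothetical sketch: bundles the datasets with the metrics to evaluate."""

    # Field types are assumptions; the PR only names the attributes.
    train_dataset: Sequence[Any]
    test_dataset: Sequence[Any]
    # A list allows several metrics per benchmark.
    metrics: List[Any] = field(default_factory=list)


# Placeholder data: (input, target) pairs and metric names as plain strings.
benchmark = Benchmark(
    train_dataset=[(0.0, 0.0), (1.0, 2.0)],
    test_dataset=[(2.0, 4.0)],
    metrics=["L1_error", "MS_error"],
)
```

Holding metrics in a list is what lets a single benchmark carry more than one metric.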
Which issue does this PR tackle?
The benchmark interface could not handle multiple metrics.
There was no consistent way of implementing and evaluating metrics.
How does it solve the problem?
Changes the Benchmark interface to a dataclass with the attributes train_dataset, test_dataset, and metrics.
Introduces Metric base class.
Introduces error metrics L1_error and MS_error.
Introduces operator metrics NumberOfParameters and SpeedOfEvaluation.
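The metric hierarchy listed above could look roughly like the following sketch. The class names Metric, L1_error, MS_error, NumberOfParameters, and SpeedOfEvaluation are taken from this PR; the call signature, the method bodies, and the LinearOperator helper are hypothetical and only illustrate the idea with plain Python numbers.

```python
import time
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Dataset = Sequence[Tuple[float, float]]  # assumed: (input, target) pairs


class LinearOperator:
    """Hypothetical stand-in for an operator: y = a * x + b."""

    def __init__(self, a: float, b: float) -> None:
        self.parameters: List[float] = [a, b]

    def __call__(self, x: float) -> float:
        a, b = self.parameters
        return a * x + b


class Metric(ABC):
    """Base class; the __call__ signature is an assumption."""

    @abstractmethod
    def __call__(self, operator: LinearOperator, dataset: Dataset) -> float:
        ...


class L1_error(Metric):
    def __call__(self, operator, dataset):
        # Mean absolute difference between prediction and target.
        return sum(abs(operator(x) - y) for x, y in dataset) / len(dataset)


class MS_error(Metric):
    def __call__(self, operator, dataset):
        # Mean squared difference between prediction and target.
        return sum((operator(x) - y) ** 2 for x, y in dataset) / len(dataset)


class NumberOfParameters(Metric):
    def __call__(self, operator, dataset):
        # Operator property; the dataset is not used.
        return float(len(operator.parameters))


class SpeedOfEvaluation(Metric):
    def __call__(self, operator, dataset):
        # Average wall-clock seconds per evaluated sample.
        start = time.perf_counter()
        for x, _ in dataset:
            operator(x)
        return (time.perf_counter() - start) / len(dataset)


operator = LinearOperator(2.0, 0.0)
test_dataset = [(1.0, 2.0), (2.0, 5.0)]
metrics = [L1_error(), MS_error(), NumberOfParameters(), SpeedOfEvaluation()]
results = {type(m).__name__: m(operator, test_dataset) for m in metrics}
```

Because every metric shares the same call signature, evaluating a benchmark reduces to one loop over its metrics.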
How are the changes tested?
WIP.
Checklist for Contributors
[ ] Scope: This PR tackles exactly one problem.
[ ] Conventions: The branch follows the feature/title-slug convention.
[ ] Conventions: The PR title follows the Bugfix: Title convention.
[ ] Coding style: The code passes all pre-commit hooks.
[ ] Documentation: All changes are well-documented.
[ ] Tests: New features are tested and all tests pass successfully.
[ ] Changelog: Updated CHANGELOG.md for new features or breaking changes.
[ ] Review: A suitable reviewer has been assigned.
Checklist for Reviewers:
[ ] The PR solves the issue it claims to solve and only this one.
[ ] Changes are tested sufficiently and all tests pass.
Cleanup: Benchmark Interface