MLBazaar / BTB

A simple, extensible library for developing AutoML systems
https://mlbazaar.github.io/BTB/
MIT License

Some benchmarking datasets are too easy. #199

Open pvk-developer opened 4 years ago

pvk-developer commented 4 years ago

In our collection of datasets that we use for our benchmarking process, we find some datasets that do not represent a hard problem and exhibit the following properties when solving them:

We still keep these datasets in our benchmarking process and test them on every release, because they give us a testing opportunity. In the future we will add a test that looks for changes or deviations from the expected results; if we see a significant difference, we will have to examine what changed in the release. Such anomalies can occur both in our tuners and in the tuners of an external library, so they help us detect changes to our own code or to an external library used by the tuners.
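A rough sketch of what such a deviation check could look like. This is only an illustration: `find_deviations`, the `(dataset, tuner)` keys, the scores, and the tolerance are all hypothetical and not part of BTB or its benchmarking code.

```python
import math


def find_deviations(expected, observed, tolerance=1e-6):
    """Return (key, expected_score, observed_score) for every result that moved.

    Both arguments are hypothetical dicts mapping (dataset, tuner) to a score.
    A result missing from `observed` is reported with an observed score of None.
    """
    deviations = []
    for key, expected_score in expected.items():
        observed_score = observed.get(key)
        unchanged = observed_score is not None and math.isclose(
            expected_score, observed_score, rel_tol=0.0, abs_tol=tolerance
        )
        if not unchanged:
            deviations.append((key, expected_score, observed_score))
    return deviations


# Made-up expected results for an "easy" dataset, and a release where one tuner drifted.
expected = {("easy_dataset", "tuner_a"): 1.0, ("easy_dataset", "tuner_b"): 1.0}
observed = {("easy_dataset", "tuner_a"): 1.0, ("easy_dataset", "tuner_b"): 0.97}

deviations = find_deviations(expected, observed)
```

Any non-empty `deviations` list would be the signal to go and examine what changed in the release or in the external library.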

We also use them to check whether we have integrated another library correctly, since we expect it to have the same number of ties with the other tuners over 100 iterations.
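The tie comparison could be sketched roughly as follows. The definition of a tie used here (two or more tuners reaching the same best score on a dataset) is an assumption, as are `count_ties`, the tuner names, and the scores; none of this comes from the BTB code base.

```python
def count_ties(scores, tolerance=1e-9):
    """Count, per tuner, the datasets on which it shares the best score.

    `scores` is a hypothetical dict: dataset -> {tuner: best score after N iterations}.
    A dataset counts as a tie only when at least two tuners reach the best score.
    """
    ties = {}
    for by_tuner in scores.values():
        best = max(by_tuner.values())
        winners = [t for t, s in by_tuner.items() if abs(s - best) <= tolerance]
        if len(winners) > 1:
            for tuner in winners:
                ties[tuner] = ties.get(tuner, 0) + 1
    return ties


# Made-up best scores after 100 iterations.
scores = {
    "easy_1": {"existing_tuner": 1.0, "new_tuner": 1.0},
    "easy_2": {"existing_tuner": 0.99, "new_tuner": 0.99},
    "hard_1": {"existing_tuner": 0.80, "new_tuner": 0.75},
}

ties = count_ties(scores)
# A correctly integrated tuner is expected to tie as often as the existing ones.
integration_looks_ok = ties.get("new_tuner", 0) == ties.get("existing_tuner", 0)
```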

We will formalize this process in the near future.