SmartDataAnalytics / SML-Bench

A Benchmark for Machine Learning from Structured Data
Apache License 2.0

How to run a learning system with different settings on one learning problem #9

Closed: patrickwestphal closed this issue 5 years ago

patrickwestphal commented 8 years ago

Conceptually, we currently allow each learning system to be executed exactly once per learning problem in a benchmark run. To compare different settings of one learning system, e.g. the OCEL and CELOE algorithms of the DL-Learner, there should be a means to run one learning system several times on one learning problem using different configurations. Technically, the benchmark runner would have to consider multiple configuration files in a learning problem folder, e.g. dllearner1.conf, dllearner2.conf, dllearner3.conf. However, which of them to apply should be determined in the overall benchmark configuration, where we currently just define which learning systems to run on which learning problems, e.g.

learningsystems = aleph, dllearner, funclog, golem, progol, progolem, toplog
scenarios = pyrimidine/1

We thus have to extend the configuration format to allow explicitly selecting, e.g., the dllearner2.conf settings. This could be achieved by treating a DL-Learner instance with config dllearner1.conf as a 'different' learning system from a DL-Learner instance with dllearner2.conf settings. An (IMHO) intuitive extension of the overall benchmark configuration, expressing that, besides the other learning systems, the DL-Learner should be executed with the configs dllearner1.conf and dllearner2.conf, could look like this:

learningsystems = aleph, dllearner-1, dllearner-2, funclog, golem, progol, progolem, toplog
scenarios = pyrimidine/1, carcinogenesis/1

However, this might cause problems, e.g. when there are multiple config files for pyrimidine/1 but just one DL-Learner config file for the carcinogenesis/1 learning problem. One could then fall back to running the DL-Learner just once, with the only configuration file found.
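
To make the fallback concrete, here is a minimal Python sketch (not SML-Bench code). It assumes that an identifier such as dllearner-2 maps to a file named dllearner2.conf in the learning problem folder, and that learning systems without any matching config file are simply skipped. The function names resolve_config and plan_runs and the folder argument are hypothetical.

```python
import re
from pathlib import Path


def resolve_config(problem_dir, system_id):
    """Return the config file for system_id inside problem_dir, or None.

    Assumed naming scheme (hypothetical): "dllearner-2" -> "dllearner2.conf",
    plain "dllearner" -> "dllearner.conf".
    """
    problem_dir = Path(problem_dir)
    if not problem_dir.is_dir():
        return None

    match = re.fullmatch(r"(?P<name>.+)-(?P<index>\d+)", system_id)
    name = match.group("name") if match else system_id
    index = match.group("index") if match else ""

    # First choice: the config file that matches the requested index exactly
    exact = problem_dir / f"{name}{index}.conf"
    if exact.is_file():
        return exact

    # Fallback: use the only config file this learning problem provides
    pattern = re.compile(rf"{re.escape(name)}\d*\.conf")
    candidates = sorted(
        p for p in problem_dir.iterdir()
        if p.is_file() and pattern.fullmatch(p.name)
    )
    return candidates[0] if len(candidates) == 1 else None


def plan_runs(problem_dir, system_ids):
    """List (system_id, config_path) pairs, using each config file only once."""
    runs, seen = [], set()
    for system_id in system_ids:
        config = resolve_config(problem_dir, system_id)
        if config is None or config in seen:
            continue  # nothing to run, or this config is already scheduled
        seen.add(config)
        runs.append((system_id, config))
    return runs
```

With this sketch, a folder containing only dllearner.conf yields a single run for the list dllearner-1, dllearner-2, while a folder containing dllearner1.conf and dllearner2.conf yields two.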

SimonBin commented 8 years ago

My suggestion would have been to configure the learning systems, for example:

learningsystem.dllearner-2.type = dllearner
learningsystem.dllearner-2.algorithm = ocel
...

patrickwestphal commented 8 years ago

Outcome of the internal discussion: We should implement both approaches.
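
For the property-based variant (learningsystem.<id>.<key> = <value>), a rough Python sketch of how such lines could be grouped per learning system instance is given below. It is an illustration only, not the SML-Bench implementation: the function name parse_learning_systems is hypothetical, and defaulting a missing type to the instance id is an assumption.

```python
from collections import defaultdict


def parse_learning_systems(lines):
    """Group 'learningsystem.<id>.<key> = <value>' lines by instance id."""
    prefix = "learningsystem."
    systems = defaultdict(dict)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and lines that are not key/value pairs
        key, value = (part.strip() for part in line.split("=", 1))
        if not key.startswith(prefix):
            continue  # e.g. the plain 'learningsystems = ...' list
        # Split at the last dot so the instance id may itself contain dots
        system_id, _, option = key[len(prefix):].rpartition(".")
        if system_id:
            systems[system_id][option] = value
    # Assumption: an instance without an explicit type is named after its system
    for system_id, options in systems.items():
        options.setdefault("type", system_id)
    return dict(systems)
```

Fed the two example lines from the suggestion above, this returns one entry for dllearner-2 with type dllearner and algorithm ocel. The entry could then be combined with the suffix-based naming, since both approaches identify an instance by the same id.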