SmartDataAnalytics / SML-Bench

A Benchmark for Machine Learning from Structured Data
Apache License 2.0

Measures for scoring classifiers #10

Closed giuseta closed 8 years ago

giuseta commented 8 years ago

Not all classifiers are discrete; some yield a per-instance score or probability. In those cases ROC and PR curves (and the areas under them) are usually used. I therefore suggest that the configuration file produced by the ./validate script include a field stating the type of result (binary or score). For the score type, it should also give a list of pairs (<type_example>,<score>), where <type_example> is + or - and <score> is the numeric value. For example:

type: score
values: (
    (+,0.567),
    (+,0.465),
    (-,0.987),
    ...
)
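
To make the proposal concrete, here is a minimal Python sketch of how a framework might consume such a list and compute the area under the ROC curve. It is an illustration only, not part of SML-Bench: the function names `parse_values` and `roc_auc` are hypothetical, the parser assumes the textual layout shown above, and only the three pairs that are actually listed are used (the elided `...` entries are skipped).

```python
import re

# Matches one pair such as "(+,0.567)"; assumes the layout proposed above.
PAIR = re.compile(r"\(\s*([+-])\s*,\s*([0-9]*\.?[0-9]+)\s*\)")


def parse_values(text):
    """Extract (label, score) pairs from a 'values: ( ... )' block."""
    return [(m.group(1), float(m.group(2))) for m in PAIR.finditer(text)]


def roc_auc(pairs):
    """ROC AUC computed as the probability that a random positive example
    gets a higher score than a random negative one (ties count as half)."""
    pos = [score for label, score in pairs if label == "+"]
    neg = [score for label, score in pairs if label == "-"]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Only the pairs shown in the example above; the "..." line is ignored.
SAMPLE = """
type: score
values: (
    (+,0.567),
    (+,0.465),
    (-,0.987),
    ...
)
"""

pairs = parse_values(SAMPLE)
print(pairs)           # [('+', 0.567), ('+', 0.465), ('-', 0.987)]
print(roc_auc(pairs))  # 0.0, since the only negative outscores both positives
```

The pairwise comparison is quadratic in the number of examples, which is fine for a sketch; a sort-based ranking would be the usual choice for large result sets.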

Any suggestions/comments?

patrickwestphal commented 8 years ago

Recap of what we just discussed: it seems to make more sense to put the validation result type into the learning system configuration, for example learningsystems/dllearner/system.ini, so that the benchmark framework knows a priori what kind of validation result to expect.
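
As a rough sketch of that idea, the framework could read the expected result type from the learning system's ini file before running validation. The section name `main`, the key `validation_result_type`, and the helper `read_result_type` are assumptions made for this example; the thread does not specify the actual layout of system.ini.

```python
import configparser

# Hypothetical contents of a system.ini; section and key names are assumed.
SAMPLE_INI = """
[main]
validation_result_type = score
"""

ALLOWED_TYPES = ("binary", "score")


def read_result_type(ini_text, default="binary"):
    """Return the declared validation result type, or `default` if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback covers both a missing section and a missing key
    result_type = parser.get("main", "validation_result_type", fallback=default)
    if result_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown validation result type: {result_type!r}")
    return result_type


print(read_result_type(SAMPLE_INI))  # score
print(read_result_type(""))          # binary (nothing declared, default used)
```

Defaulting to `binary` here is a design choice for the sketch, so that learning systems without the new field keep their existing behaviour.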