openml / OpenML

Open Machine Learning
https://openml.org
BSD 3-Clause "New" or "Revised" License

Mean and mean_weighted measures are not computed #627

Open hildeweerts opened 6 years ago

hildeweerts commented 6 years ago

For multi-class classification tasks, measures such as mean_weighted_area_under_roc_curve, mean_precision, and mean_weighted_precision are not computed. Instead, area_under_roc_curve and precision are already weighted by class size. This is inconsistent with the evaluation measure descriptions (https://www.openml.org/search?type=measure) and also means that you cannot actually choose between e.g. weighted and unweighted precision. Is this intentional?

joaquinvanschoren commented 6 years ago

We return all the per-class scores, so you can weight them however you want, e.g. see https://www.openml.org/r/8857838 -> you get 10 values for all class-specific measures.

Since the evaluation engine is based on WEKA, it automatically computes the weighted AUC, precision, etc., so area_under_roc_curve will always be the weighted version. I agree that the current description is confusing and should be fixed asap.

You're right that we don't automatically compute the unweighted versions (I checked the database and we don't do this for new runs). We should fix that, back-compute them for old runs, and compute them for all new runs. At least, if we want to support the unweighted versions.

I double-checked the WEKA code:

    double aucTotal = 0;
    for (int i = 0; i < m_NumClasses; i++) {
      double temp = areaUnderROC(i);
      if (!Utils.isMissingValue(temp)) {
        aucTotal += (temp * classCounts[i]);
      }
    }
    return aucTotal / classCountSum;

So, yeah, weighted.
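
To make the difference concrete, here is a minimal Python sketch of the two averages, computed from per-class scores like the ones the run page returns. It mirrors the WEKA loop quoted above under two assumptions: that `classCountSum` is the sum of all class counts (that part of the WEKA method is not shown in the snippet), and that missing per-class values are represented as NaN. The function names and the example numbers are made up for illustration and are not part of OpenML or WEKA.

```python
import math


def weighted_mean(scores, class_counts):
    """Class-size-weighted average, following the quoted WEKA loop.

    Missing scores (NaN here, by assumption) are skipped in the sum,
    but the denominator is still the total count over all classes.
    """
    total = 0.0
    for score, count in zip(scores, class_counts):
        if not math.isnan(score):
            total += score * count
    return total / sum(class_counts)


def unweighted_mean(scores):
    """Plain (macro) average over classes that have a score.

    Skipping missing values is a choice made for this sketch,
    not something the thread specifies.
    """
    valid = [s for s in scores if not math.isnan(s)]
    if not valid:
        return float("nan")
    return sum(valid) / len(valid)


# Hypothetical per-class AUC values and class sizes.
per_class_auc = [0.9, 0.8, 0.5]
class_counts = [50, 30, 20]

print(weighted_mean(per_class_auc, class_counts))  # about 0.79
print(unweighted_mean(per_class_auc))              # about 0.733
```

With imbalanced classes the two numbers diverge, because the weighted version is pulled toward the scores of the larger classes.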

TL;DR: area_under_roc_curve is weighted and always computed. The description must be fixed. We should add a mean_area_under_roc_curve to also have the unweighted version, and compute it automatically. Same for precision, recall, f1,...

-- Thank you, Joaquin