Closed Innixma closed 3 years ago
> Another alternative is to report metrics in `higher_is_better` form always, and flip the sign to align them.
I think I would prefer this. As in #262 (which the screenshot is from), too many columns make the table hard to read, especially if each row in the table wraps onto two (or more) lines in the terminal. Perhaps we can just prefix the column names with `-` (e.g. `acc`, `auc`, `-logloss`).
If it is acceptable for your system, I would also prefer the signs to be flipped, as it adds a great deal of consistency to the code and makes sorting much easier, since the user doesn't have to both understand and use the `higher_is_better` column. As for `-logloss`, this is an interesting idea that I hadn't thought of before, and I don't have a strong opinion on it.
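To make the sorting point concrete, here is a minimal sketch of what the flipped-sign form with `-`-prefixed names could look like. Only the `result` and `higher_is_better` names come from this thread; the row layout, the framework names, and the helpers `signed_name` and `flip` are hypothetical and not part of any existing API.

```python
# Hypothetical result rows; framework names and values are made up for illustration.
rows = [
    {"framework": "A", "metric": "logloss", "result": 0.5, "higher_is_better": False},
    {"framework": "B", "metric": "logloss", "result": 0.4, "higher_is_better": False},
    {"framework": "C", "metric": "logloss", "result": 0.7, "higher_is_better": False},
]


def signed_name(metric, higher_is_better):
    """Prefix lower-is-better metrics with '-' (e.g. 'logloss' -> '-logloss')."""
    return metric if higher_is_better else "-" + metric


def flip(row):
    """Return a copy of the row in higher-is-better form."""
    sign = 1.0 if row["higher_is_better"] else -1.0
    return {
        "framework": row["framework"],
        "metric": signed_name(row["metric"], row["higher_is_better"]),
        "result": sign * row["result"],
    }


flipped = [flip(r) for r in rows]

# Once every value is higher-is-better, ranking is a plain descending sort,
# with no per-metric branching on the consumer side.
ranked = sorted(flipped, key=lambda r: r["result"], reverse=True)
best = ranked[0]
print(best["framework"], best["metric"], best["result"])
```

The trade-off is that the raw value is no longer what a human expects to read, which is why the `-` prefix in the name matters: it signals that the sign has been flipped.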
Example of the issue
Currently, depending on the metric, a `result` value of `0.6` compared to `0.4` can be either better (accuracy, auc, r2, etc.) or worse (rmse, log_loss, mae, etc.). When generating aggregated analysis from the results, this knowledge currently has to be hardcoded by the user, which is error prone.

I'd like to propose adding a new column, `higher_is_better`. If a higher `result` value means a better score, then `higher_is_better` should be `True` or `1`. If not, then it should be `False` or `0`.

Another alternative is to always report metrics in `higher_is_better` form, flipping the sign to align them. This would cause a `log_loss` of `0.5` to be reported as `-0.5`. This is the strategy several AutoML systems use, such as AutoGluon and MLJAR, although it can be confusing when interpreted by humans (but is great for computers).
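For reference, a rough sketch of the two options side by side. The lookup table below is exactly the kind of hardcoded knowledge users maintain today; its name `HIGHER_IS_BETTER` and the helpers `with_direction` and `as_higher_is_better` are hypothetical, and the metric list is limited to the ones named above.

```python
# Hypothetical lookup covering only the metrics mentioned in this issue.
HIGHER_IS_BETTER = {
    "accuracy": True,
    "auc": True,
    "r2": True,
    "rmse": False,
    "log_loss": False,
    "mae": False,
}


def with_direction(row):
    """Option 1: keep the raw result and add a `higher_is_better` column."""
    return {**row, "higher_is_better": HIGHER_IS_BETTER[row["metric"]]}


def as_higher_is_better(metric, result):
    """Option 2: flip the sign of lower-is-better metrics."""
    return result if HIGHER_IS_BETTER[metric] else -result


print(with_direction({"metric": "rmse", "result": 0.4}))
print(as_higher_is_better("log_loss", 0.5))
print(as_higher_is_better("auc", 0.6))
```

With option 1 the consumer still has to branch on the flag when comparing or aggregating; with option 2 the comparison is uniform, but the reported number differs from the raw metric.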