rbiswas4 closed this issue 6 years ago
I haven't done the deepest literature search (I'd give you links, but it's really just everything on the first page of Google search results for "criticism of auc roc"), but I think the idea is that all thresholds are considered to get a final result. Or do you mean different thresholds for different classes? I've been wondering about that myself, as it's related to how to treat classes differently, with weights or something similar.
The link is where one is directed from the ROC page here.
The implementations I saw on Kaggle for the multi-class log-loss used what was more or less an average with equal weight to each object that doesn't really account for hierarchy or covariance, but you're right that it doesn't necessarily have to be that way. I think your idea of a customized objective function is spot-on and very much the right direction to be thinking in.
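To make the weighting question concrete, here is a minimal sketch of a multi-class log-loss that can average either per object (the equal-weight form described above) or per class with user-supplied weights. The function name `multiclass_log_loss` and the `class_weights` parameter are hypothetical, chosen for illustration; this is not the Kaggle implementation, and it does nothing about hierarchy or covariance between classes.

```python
import numpy as np


def multiclass_log_loss(probs, truth, class_weights=None, eps=1e-15):
    """Log-loss for probabilistic classifications (illustrative sketch).

    probs: (N, M) array of class probabilities for N objects and M classes.
    truth: (N,) array of integer true-class labels in [0, M).
    class_weights: optional (M,) array (hypothetical parameter). If None,
        every object counts equally; otherwise the loss is averaged within
        each class first and then combined with these weights.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    probs = probs / probs.sum(axis=1, keepdims=True)  # renormalize after clipping
    truth = np.asarray(truth)
    n_classes = probs.shape[1]

    # Log-probability that each object assigns to its own true class.
    log_p_true = np.log(probs[np.arange(len(truth)), truth])

    if class_weights is None:
        # Equal weight per object: rare classes barely move the result.
        return -log_p_true.mean()

    weights = np.asarray(class_weights, dtype=float)
    total, norm = 0.0, 0.0
    for m in range(n_classes):
        members = truth == m
        if not np.any(members):
            continue  # skip classes absent from the truth table
        total += weights[m] * (-log_p_true[members].mean())
        norm += weights[m]
    return total / norm
```

With three well-classified objects of class 0 and one badly classified object of class 1, the per-object average is dominated by the common class, while equal class weights let the rare class contribute half of the total.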
Maybe I'm being dense, but in what way is the Brier score specific to time-dependent data?
Yes, the issue is effective extrapolation when there are too few objects. (Some attention is given to this matter in the "literature search" results mentioned above, in the context of medicine.)
I actually meant something closer to the classification problem and farther from the science. Deterministic metrics would include those derived from accuracy, precision, recall, or a confusion matrix (like the Matthews Correlation Coefficient). (And then there's this one that I'm having trouble categorizing.) Varying a threshold over probabilistic classifications (as in the ROC/AUC) could be applied to other deterministic metrics to obtain a novel probabilistic metric. I'm hoping to get a feel for how such a thing would behave. As for science metrics, I think it would be hard to use any that didn't prioritize one science goal over others or make too many assumptions, so we might want to stay away from that.
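As a rough illustration of the threshold-sweep idea, the sketch below computes the Matthews Correlation Coefficient at each threshold applied to binary class probabilities and then integrates the resulting curve. It assumes a binary problem for simplicity, and the names `mcc` and `threshold_swept_mcc`, as well as the choice to summarize the curve by its area, are my own assumptions rather than an established metric.

```python
import numpy as np


def mcc(truth, pred):
    """Matthews Correlation Coefficient for binary labels (True = positive)."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = int(np.sum(truth & pred))
    tn = int(np.sum(~truth & ~pred))
    fp = int(np.sum(~truth & pred))
    fn = int(np.sum(truth & ~pred))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    # Common convention: report 0 when any margin of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def threshold_swept_mcc(truth, p_positive, thresholds=None):
    """Evaluate MCC at each threshold and integrate over the threshold axis.

    p_positive: (N,) array of probabilities for the positive class.
    Returns (thresholds, scores, area); `area` is a hypothetical summary
    statistic obtained with the trapezoid rule.
    """
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)
    thresholds = np.asarray(thresholds, dtype=float)
    p = np.asarray(p_positive, dtype=float)
    scores = np.array([mcc(truth, p >= t) for t in thresholds])
    area = float(np.sum(0.5 * (scores[1:] + scores[:-1]) * np.diff(thresholds)))
    return thresholds, scores, area
```

Plotting `scores` against `thresholds` for a few mock classifiers would be one way to get a feel for how such a metric behaves before committing to any summary statistic.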
I'm going to close this since a lot of the questions were settled with #3.
I went through your notebook outline, which has compiled a large number of metrics, many of which I was totally unaware of. Thanks, Alex, for setting this up to bring other alternatives into the discussion!
I had the following questions: