Open: foabodo opened this issue 6 years ago
Good question @foabodo
@akundaje have you published examples of modeling ambiguous labels? I vaguely recall seeing an example of this not too long ago, but I don't remember whether it was from your group or someone else.
@foabodo this doesn't directly answer your question, but this is Anshul's paper I was thinking of:
Learning to Abstain via Curve Optimization https://arxiv.org/abs/1802.07024
@agitter Thank you for digging this reference out so quickly. I will give it a read and close this issue if my curiosity is satisfied.
@agitter I have digested @akundaje’s paper and understand why it does not directly address my question. Their idea is interesting in its own right. At the same time, I can see how basic knowledge of statistics would lead anyone to infer that the statement in my original comment holds.
While it would be nice to have examples, I don’t object to letting this issue be closed.
@foabodo I agree it would be better to have specific examples or references supporting this statement. Our current maintainers can weigh in, but it makes sense to me to leave this issue open until a contributor can help with an example or reference.
That works for me too.
The Formulation of Classification Labels subsection of the Discussion section presents a technique for managing data labeling when some examples are "ambiguous" (i.e. near the threshold that divides positive and negative examples). This technique is related to my own work and of particular interest to me, but I observed that the text cites no literature that provides concrete examples of how "correlation between model predictions on [ambiguous] examples and their signal values can be used to evaluate if the model correctly ranks these examples between positive and negative examples."
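For anyone reading this thread while the issue is open, here is a minimal sketch of what such an evaluation could look like. It is not taken from the manuscript or from any cited paper. It assumes that "ambiguous" means the continuous signal value falls inside a band around the labeling threshold, and that Spearman rank correlation is a reasonable choice of correlation. The function names, the `low`/`high` band limits, and the toy numbers are all hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ambiguous_rank_correlation(signal, preds, low, high):
    """Spearman correlation between signal values and model predictions,
    restricted to examples whose signal lies in the band [low, high].

    The band limits are an assumption of this sketch, not a published rule.
    """
    pairs = [(s, p) for s, p in zip(signal, preds) if low <= s <= high]
    if len(pairs) < 3:
        raise ValueError("too few ambiguous examples to correlate")
    s, p = zip(*pairs)
    return spearman(list(s), list(p))


# Toy data (made up): continuous signal values and model-predicted scores.
signal = [0.5, 1.8, 2.1, 2.4, 2.9, 3.3, 6.0]
preds = [0.02, 0.30, 0.41, 0.55, 0.62, 0.80, 0.99]

# Hypothetical ambiguity band around a labeling threshold of about 2.5.
rho = ambiguous_rank_correlation(signal, preds, low=1.5, high=3.5)
print(f"Spearman correlation on ambiguous examples: {rho:.3f}")
```

Under these assumptions, a correlation near 1 on the held-out ambiguous examples would suggest the model orders them consistently with their signal values even though they were excluded from the binary labels, while a value near 0 would suggest it does not. A rank-based correlation is used here because the question is about ranking rather than calibration, but that is a design choice of this sketch.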