kubeflow / code-intelligence

ML-Powered Developer Tools, using Kubeflow
https://medium.com/kubeflow/reducing-maintainer-toil-on-kubeflow-with-github-actions-and-machine-learning-f8568374daa1?source=friends_link&sk=ac77444f00c230e7d787edbfb0081918
MIT License

[label bot] Can we take advantage of negative examples #140

Open jlewi opened 4 years ago

jlewi commented 4 years ago

Forked from: https://github.com/microsoft/vscode-github-triage-actions/issues/5#issuecomment-628312837

From @hamelsmu

I know you didn't ask me the question, but I can try to answer: the fact that you initially had the wrong label but now have the correct one doesn't seem like something you would handle differently from the main case (aside from this suggesting that it is a much harder example for your model to classify; one would have to look at those examples to determine that).

So here's what I'm thinking. Suppose we have an issue that doesn't have a particular label, e.g. "platform/gcp". The absence of that label could mean one of two things: the issue is a genuine negative example for that label, or it simply hasn't been labeled yet.

During training we would like to distinguish between these two cases. If a user explicitly removes a label from an issue, that gives me a high-confidence signal that the issue is a negative example for that label.

If the issue just doesn't have that label, then it could be an unlabeled example or it could be a negative example. Hard to say.
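One way to harvest that high-confidence signal would be to mine "unlabeled" events from the GitHub issue events API. This is only a sketch, not the bot's actual pipeline; the `removed_labels` helper and the token handling are assumptions, and pagination is ignored:

```python
import requests

def removed_labels(owner: str, repo: str, issue_number: int, token: str) -> set[str]:
    """Return the labels a human explicitly removed from an issue.

    An "unlabeled" event means someone deliberately took the label off,
    which is a high-confidence negative example for that label.
    (Sketch only: no pagination, no rate-limit handling.)
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}/events"
    resp = requests.get(url, headers={"Authorization": f"token {token}"})
    resp.raise_for_status()
    return {
        event["label"]["name"]
        for event in resp.json()
        if event["event"] == "unlabeled"
    }
```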

Is there some way to weight the negative examples more?
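One concrete option, assuming a PyTorch-style multi-label setup (a sketch, not the bot's actual training code; the 2.0/0.5 weights are illustrative): `binary_cross_entropy_with_logits` accepts a per-element `weight` tensor, so known negatives can count more than merely-absent labels:

```python
import torch
import torch.nn.functional as F

# One issue, three labels; 1 = label present, 0 = label absent.
targets = torch.tensor([[1.0, 0.0, 0.0]])
logits = torch.tensor([[1.5, -0.5, 0.2]])  # raw model scores

# Per-element weights (illustrative values): an explicitly removed
# label is a confident negative, so it gets more weight than a label
# that is merely absent and might just be unlabeled.
weights = torch.tensor([[1.0,   # positive example: normal weight
                         2.0,   # user removed this label: strong negative
                         0.5]]) # label simply absent: weak signal

loss = F.binary_cross_entropy_with_logits(logits, targets, weight=weights)
print(loss.item())
```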

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

| Label | Probability |
| --- | --- |
| kind/feature | 0.71 |
| area/engprod | 0.51 |

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

kf-label-bot-dev[bot] commented 4 years ago

Issue Label Bot is not confident enough to auto-label this issue. See dashboard for more details.

hamelsmu commented 4 years ago

Some general thoughts

If you don't have the right label and only the wrong label, I would suggest going through and hand-labeling those if possible. If that is not feasible, you could explore some variation of label smoothing: give a slightly stronger negative label to the known negatives (e.g. 0.1) than to the other classes for which the label is unknown (e.g. 0.15). I'm not sure this will work; it's a wild idea, but it's the best thing I can think of to take advantage of this "negative information". P.S. I haven't done a literature search for this; it is quite possible someone has tried it.

Label smoothing would dictate that your labels look something like this: known positives get a target near 1 (e.g. 0.9), known negatives get a small target (e.g. 0.1), and classes whose status is unknown get a slightly softer one (e.g. 0.15).

So instead of 0/1 labels, you have labels that represent the "confidence" you have in the presence of each class. You can still use cross-entropy loss. You may not be able to use AutoML for this.
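As a concrete sketch of the soft-target idea (assuming PyTorch; the 0.9/0.1/0.15 targets follow the smoothing scheme above and are illustrative), binary cross-entropy with logits accepts non-0/1 targets directly:

```python
import torch
import torch.nn.functional as F

# Soft targets for one issue across three labels (illustrative):
#   known positive           -> 0.90
#   explicitly removed label -> 0.10 (known negative)
#   never-applied label      -> 0.15 (unknown, slightly softer negative)
targets = torch.tensor([[0.90, 0.10, 0.15]])
logits = torch.tensor([[2.0, -1.0, 0.3]])  # raw model scores

# binary_cross_entropy_with_logits accepts soft targets, so the
# smoothed "confidence" labels plug in without a custom loss.
loss = F.binary_cross_entropy_with_logits(logits, targets)
print(loss.item())
```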

However, I am not sure this will even work given the number of classes and the small size of your dataset; the properties of the problem you are trying to solve make me feel like this approach will fall flat. This seems like a case where we are not confident that the supervision we are providing to the model is correct, so there are other approaches you can take as well. It's hard to say, but I think it is important to first hand-label 100 examples to develop some intuition on how noisy your labels are, as well as on how ML could solve this problem. TL;DR: the best first step for training an algorithm is to "be the algorithm" for a little bit.

Hope that helps