snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0

When a token does not fall under any of the class labels or abstains in NER? #1692

Closed anjani-dhrangadhariya closed 2 years ago

anjani-dhrangadhariya commented 2 years ago

Issue description

I have a program that labels a sequence of words using ontologies. I have labeling functions for class negative, class positive, and class abstain. If a word does not fall under any of these class labels, should I leave it unlabeled and ignore it, or should I force-label it as either class negative or abstain? I would be grateful for any hints or help.

humzaiqbal commented 2 years ago

Hi anjani-dhrangadhariya,

Thanks for reaching out! To clarify: you explicitly want 'abstain' as a class, as opposed to just treating cases where no labeling function votes as abstaining, correct?

If that's the case, I don't think you need to force-label when no labeling function votes on a given word. Probabilistic labels will still be generated anyway (the fact that no labeling function voted is itself a signal the label model picks up on).
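As a rough illustration of what "no labeling functions vote on a given word" looks like, here is a minimal plain-Python sketch that does not use the Snorkel API. The ontology sets, the labeling function names, and the constants are all hypothetical. The integer values follow the thread's convention (0 for abstain, 1 for positive); Snorkel itself may expect different values, so check its documentation before reusing them.

```python
# Hypothetical label values: 0 means "no vote" (assumption, following the thread).
ABSTAIN = 0
POSITIVE = 1

# Hypothetical ontology term sets, for illustration only.
ONTOLOGY_A = {"asthma"}
ONTOLOGY_B = {"asthma", "diabetes"}


def lf_ontology_a(token):
    """Vote POSITIVE if the token is in ontology A, otherwise abstain."""
    return POSITIVE if token.lower() in ONTOLOGY_A else ABSTAIN


def lf_ontology_b(token):
    """Vote POSITIVE if the token is in ontology B, otherwise abstain."""
    return POSITIVE if token.lower() in ONTOLOGY_B else ABSTAIN


def build_label_matrix(tokens, lfs):
    """One row per token, one column per labeling function."""
    return [[lf(token) for lf in lfs] for token in tokens]


def uncovered_tokens(tokens, label_matrix):
    """Tokens on which every labeling function abstained."""
    return [
        token
        for token, row in zip(tokens, label_matrix)
        if all(vote == ABSTAIN for vote in row)
    ]


tokens = ["Patients", "with", "asthma", "or", "diabetes"]
lfs = [lf_ontology_a, lf_ontology_b]
label_matrix = build_label_matrix(tokens, lfs)
no_votes = uncovered_tokens(tokens, label_matrix)
```

The rows for tokens in `no_votes` stay all-abstain in the matrix; nothing is force-labeled before the matrix is handed to a label model.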

Hope this helps!

anjani-dhrangadhariya commented 2 years ago

Thank you for your answer, Humza! I removed my previous response because it was based on a different understanding. I am not using an abstain class; I only use abstain as the label when none of the labeling functions voted. The problem was that I did not have any LFs for the negative label. To make the label model work, I need labeling functions that emit non-zero labels {-1, 1}.
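A minimal sketch of the fix described above, in plain Python rather than the Snorkel API: a check that the set of labeling functions emits both non-zero labels. The term sets, function names, and the stopword heuristic for the negative class are hypothetical. The constants assume the thread's convention (0 for abstain, {-1, 1} for the classes), which may differ from the values a given Snorkel release expects.

```python
# Assumed label values, following the {-1, 1} convention used in the thread.
ABSTAIN = 0
NEGATIVE = -1
POSITIVE = 1

# Hypothetical resources, for illustration only.
ONTOLOGY_TERMS = {"asthma", "diabetes"}
STOPWORDS = {"with", "or", "the", "of"}


def lf_ontology(token):
    """Positive-class LF: vote POSITIVE on ontology matches."""
    return POSITIVE if token.lower() in ONTOLOGY_TERMS else ABSTAIN


def lf_stopword(token):
    """Negative-class LF: vote NEGATIVE on stopwords (a hypothetical heuristic)."""
    return NEGATIVE if token.lower() in STOPWORDS else ABSTAIN


def emitted_classes(tokens, lfs):
    """Set of non-abstain labels that the LFs emit over the given tokens."""
    votes = {lf(token) for token in tokens for lf in lfs}
    return votes - {ABSTAIN}


tokens = ["Patients", "with", "asthma", "or", "diabetes"]

only_positive = emitted_classes(tokens, [lf_ontology])
both_classes = emitted_classes(tokens, [lf_ontology, lf_stopword])
```

With only the ontology LF, `only_positive` contains a single class; adding the negative LF makes `both_classes` cover {-1, 1}, while "Patients" is still left as abstain rather than force-labeled.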