snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0

How do I handle a label that groups together UNCATEGORISED or OTHERS records? #1554

Closed srimugunthan closed 4 years ago

srimugunthan commented 4 years ago

We have 10 labels that categorise our text records. Some records don't fall into any of the 10 categories, so we just label them "OTHERS".

If I am using Snorkel to automatically assign these 11 labels to the input text records, should I create a labelling function for "OTHERS"? I have regular expressions for each of the 10 labels, so I could write a labelling function for "OTHERS" that fires only when none of the regular expressions for the other 10 labels match. But this seems to go against the idea of Snorkel's ABSTAIN.

So I am wondering: what is the best way to handle the "OTHERS" label? I'd appreciate any input.

paroma commented 4 years ago

A similar question related to the OTHER class is discussed here. Feel free to reopen if you have other questions!
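
For readers who want to see the two behaviours from the question side by side, here is a minimal plain-Python sketch. It is not taken from the linked discussion and does not import Snorkel. The label ids, the two example patterns, and the helper names (`make_class_lf`, `lf_others`, `votes`) are all hypothetical; only the use of `-1` for ABSTAIN follows Snorkel's convention.

```python
import re

ABSTAIN = -1  # Snorkel's convention for "this function casts no vote"
OTHERS = 10   # hypothetical id: 0-9 are the real categories, 10 is OTHERS

# Hypothetical patterns; only two of the ten categories are shown.
CLASS_PATTERNS = {
    0: re.compile(r"\brefund\b", re.IGNORECASE),
    1: re.compile(r"\bdelivery\b", re.IGNORECASE),
}


def make_class_lf(label, pattern):
    """Build a function that votes `label` on a match and abstains otherwise."""
    def lf(text):
        return label if pattern.search(text) else ABSTAIN
    lf.__name__ = f"lf_class_{label}"
    return lf


class_lfs = [make_class_lf(label, pat) for label, pat in CLASS_PATTERNS.items()]


def lf_others(text):
    """Vote OTHERS only when none of the category patterns match."""
    if any(pat.search(text) for pat in CLASS_PATTERNS.values()):
        return ABSTAIN
    return OTHERS


def votes(text):
    """Collect one vote per function, category functions first, OTHERS last."""
    return [lf(text) for lf in class_lfs + [lf_others]]
```

With real Snorkel, each function would take a data point (for example `x.text`) rather than a raw string and be wrapped with the `@labeling_function()` decorator from `snorkel.labeling`. One thing to keep in mind with the negation approach: `lf_others` votes OTHERS for every record the regexes miss, including records that belong to one of the 10 categories but that the patterns simply fail to catch, so it is only as reliable as the recall of those patterns.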