snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0
5.81k stars 857 forks

Negative Label Capability #1665

Closed. ndenStanford closed 2 years ago

ndenStanford commented 3 years ago

Is your feature request related to a problem? Please describe.

I would love to write a labeling function that could produce a negative label. For instance:

1 = label the data point as belonging to class 1
2 = label the data point as belonging to class 2
3 = label the data point as belonging to class 3
0 = abstain
-1 = label the data point as NOT belonging to class 1
-2 = label the data point as NOT belonging to class 2
-3 = label the data point as NOT belonging to class 3

As you can see, the labels -1, -2, and -3 carry more information than a pure abstain label. Is there a way to achieve this in the current Snorkel framework?
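To make the encoding above concrete, here is a minimal sketch in plain Python. It does not use Snorkel's API, and the names `allowed_classes`, `NUM_CLASSES`, and `ABSTAIN` are hypothetical. It reads each signed vote as the set of classes that remain possible. Note that Snorkel's own convention, as far as I know, is to use -1 for abstain and to number classes from 0, so these signed values could not be returned from a Snorkel labeling function unchanged.

```python
from typing import FrozenSet, Optional

NUM_CLASSES = 3  # classes are numbered 1..3, as in the example above
ABSTAIN = 0      # the abstain value used in this issue (not Snorkel's -1)


def allowed_classes(vote: int, num_classes: int = NUM_CLASSES) -> Optional[FrozenSet[int]]:
    """Return the classes still possible after a signed vote, or None for abstain."""
    if abs(vote) > num_classes:
        raise ValueError(f"vote {vote} is outside the range -{num_classes}..{num_classes}")
    if vote == ABSTAIN:
        return None  # no information
    if vote > 0:
        return frozenset({vote})  # positive vote: exactly this class
    # negative vote: every class except the one that was ruled out
    return frozenset(range(1, num_classes + 1)) - {-vote}


if __name__ == "__main__":
    for v in (1, 0, -2):
        print(v, allowed_classes(v))
```

Under this reading, a negative vote is strictly more informative than an abstain, because it shrinks the candidate set instead of leaving it untouched.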

Describe the solution you'd like

I would love for Snorkel to support negative labels as explained above.

Describe alternatives you've considered

For now, I map all of these negative cases to abstain. However, I am seeing poor performance when fitting the LabelModel.

I am seeing something like this in my experiment. The four numbers below are the outputs of the labeling functions, and the arrow points to the LabelModel prediction.

1, 0, 0, 0 --> 2

Clearly, there's something off here. However, if we had the ability to output something like this

1, 0, -2, -2

The model would know that it shouldn't output 2 as the final prediction.
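As a rough illustration of that intuition, the sketch below tallies signed votes per class: a positive vote adds one point to its class and a negative vote subtracts one. This is a simple stand-in, not how Snorkel's LabelModel works, and the function names `signed_vote_tally` and `predict` are hypothetical.

```python
from typing import Dict, List, Optional

ABSTAIN = 0  # the abstain value used in this issue (not Snorkel's -1)


def signed_vote_tally(votes: List[int], num_classes: int) -> Dict[int, int]:
    """Score each class 1..num_classes: +1 per vote for it, -1 per vote against it."""
    scores = {c: 0 for c in range(1, num_classes + 1)}
    for v in votes:
        if v == ABSTAIN:
            continue  # abstains carry no information
        if abs(v) > num_classes:
            raise ValueError(f"vote {v} is outside the range -{num_classes}..{num_classes}")
        scores[abs(v)] += 1 if v > 0 else -1
    return scores


def predict(votes: List[int], num_classes: int) -> Optional[int]:
    """Return the single top-scoring class, or None when the top score is tied."""
    scores = signed_vote_tally(votes, num_classes)
    best = max(scores.values())
    winners = [c for c, s in scores.items() if s == best]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    print(signed_vote_tally([1, 0, -2, -2], num_classes=3))
    print(predict([1, 0, -2, -2], num_classes=3))
```

With the votes `1, 0, -2, -2`, class 2 ends up with the lowest score, so this tally would not pick it. A learned model would also weight each labeling function by its estimated accuracy, which this sketch ignores.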

I appreciate your time and attention to this matter.

rsmith49 commented 3 years ago

Hi @ndenStanford! Unfortunately, we do not currently plan to support negative class labels from LFs. However, you may find this paper helpful. It explores the concept of "Partial Labelers", which I think is in line with what you are suggesting.

If you end up diving into the research and want to implement it as an enhancement here, we would love for you to submit a PR!

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.