Conflict vs Overlaps, Correct & Incorrect

snorkel-team / snorkel

A system for quickly generating training data with weak supervision

Apache License 2.0

5.81k stars, 857 forks

Thanks for the question! As described in this tutorial, a labeling function can either assign a label to a data point or abstain. That is likely why, with 1000 data points in your dataset, only 950 of them received a label. Correct and incorrect are counted only over the data points that received a label from the labeling function and also have a ground truth label.

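Overlaps refer to data points that are labeled by more than one labeling function (the other labeling functions did not abstain), while conflicts refer to data points where labeling functions assign different labels to the same data point. Every conflict is therefore also an overlap, but an overlap where the labeling functions agree is not a conflict.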

Conflict vs Overlaps, Correct & Incorrect #1546