snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0
5.81k stars 857 forks

Value of mu for a particular LF depends on 3 LFs only or on all LFs? #1678

Closed: mayankgoyal1993 closed 2 years ago

mayankgoyal1993 commented 3 years ago

Why does the value of mu for a particular LF change as we keep adding more LFs? As long as the minimum of 3 LFs is satisfied, shouldn't the value of mu for a given LF be independent of the other LFs?

rsmith49 commented 3 years ago

Hi @mayankgoyal1993! The label model uses all available LFs for the estimation of mu (3 is merely the minimum required number, not an upper limit on how much information the label model will use). Hope that helps!
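To make the "3 is the minimum, not the limit" point concrete, here is a minimal sketch in a simplified setting: binary labels in {-1, +1}, no abstains, and LFs that are conditionally independent given the true label. Under those assumptions the accuracy `a_i = E[lf_i * y]` (standing in for mu here) satisfies `E[lf_i * lf_j] = a_i * a_j`, so any two other LFs give one estimate of `a_i`. This is not Snorkel's `LabelModel` code; the accuracies, sample size, and helper names are all hypothetical.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_acc = np.array([0.8, 0.6, 0.7, 0.5, 0.65])  # hypothetical E[lf_i * y]

# Simulate true labels and LF votes (assumption: LFs are independent given y).
y = rng.choice([-1, 1], size=n)
agree = rng.random((n, true_acc.size)) < (1 + true_acc) / 2
L = np.where(agree, y[:, None], -y[:, None])

# Empirical second moments E[lf_i * lf_j]; no true labels are used here.
M = (L.T @ L) / n


def triplet_estimate(i, j, k):
    """Estimate |a_i| from the triplet (i, j, k): sqrt(|M_ij * M_ik / M_jk|)."""
    return np.sqrt(abs(M[i, j] * M[i, k] / M[j, k]))


def accuracy_estimates(i, num_lfs):
    """One estimate of |a_i| per pair of other LFs among the first num_lfs."""
    others = [j for j in range(num_lfs) if j != i]
    return [triplet_estimate(i, j, k) for j, k in itertools.combinations(others, 2)]


first_three = accuracy_estimates(0, 3)  # exactly one triplet
all_five = accuracy_estimates(0, 5)     # six triplets

print("3 LFs:", first_three[0])
print("5 LFs, per triplet:", all_five)
print("5 LFs, averaged:", np.mean(all_five))
```

With exact moments every triplet would return the same value. With moments estimated from finite data each triplet is slightly off in a different way, so an estimate that combines all of them shifts a little whenever an LF is added.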

mayankgoyal1993 commented 3 years ago

Thanks for the response. Does that mean the system of equations we solve over all available LFs has more than one solution? Solving with only 3 LFs already gives a solution, and those equations stay the same when we use a larger number of LFs.

humzaiqbal commented 2 years ago

Hi @mayankgoyal1993, you can think of this as similar to solving an overdetermined linear system. We could solve with fewer equations, but using all available equations reduces the effect of noise the most. Other approaches have studied using just three equations at a time. More information here: http://cs229.stanford.edu/notes2019fall/weak_supervision_notes.pdf

Hope this helps!
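The overdetermined-system analogy can be sketched with a generic noisy linear system. It has nothing Snorkel-specific in it, and the unknowns, noise level, and function name are made up for illustration. The unknown vector has 3 entries, so 3 equations are enough to solve for it, but every right-hand side is noisy.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([0.7, 0.2, -0.4])  # hypothetical unknowns


def median_error(num_equations, trials=500, noise=0.1):
    """Median distance between the least-squares solution and x_true."""
    errors = []
    for _ in range(trials):
        A = rng.normal(size=(num_equations, 3))
        # Every equation is observed with noise on its right-hand side.
        b = A @ x_true + rng.normal(scale=noise, size=num_equations)
        x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
        errors.append(np.linalg.norm(x_hat - x_true))
    return float(np.median(errors))


err_3 = median_error(3)    # just enough equations: the noise is fit exactly
err_30 = median_error(30)  # overdetermined: the noise partly averages out

print("median error with 3 equations: ", err_3)
print("median error with 30 equations:", err_30)
```

In both cases the least-squares solution is unique. The extra equations do not create additional solutions; they pull the single solution closer to the true value.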

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.