snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0
5.81k stars 857 forks

Overlap and conflict rate between all labeling functions (LFs) #1707

Closed gionanide closed 1 year ago

gionanide commented 2 years ago

Problem I want to solve

Hello, thank you for your contribution and for this project. First, I would like to check whether my understanding is right. As far as I understand, if you have, for example, 5 labeling functions (LFs), each one of them is reported with a single overlap rate and a single conflict rate. I suppose that, even though an LF may overlap with several of the other LFs, the only rate shown is the highest one, and the same holds for the conflict rate. It would be insightful and helpful to report, for each LF, its pairwise overlap and conflict rates with every other LF (4 such rates per LF if there are 5 LFs). For example, with 5 LFs, the overlap and conflict rates would form two 5 x 5 matrices.

Describe the solution you'd like

The solution is to report the two matrices described above. I believe they would give a better understanding of the interactions between the LFs.
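A minimal sketch of what such pairwise matrices could look like, written in plain Python rather than against Snorkel's API. It assumes a label matrix with one row per data point and one column per LF, with `-1` marking an abstain (the convention Snorkel uses). The function name `pairwise_rates` and the toy matrix are hypothetical and only for illustration. The overlap of a pair is taken here as the fraction of data points that both LFs label, and the conflict as the fraction that both label with different values.

```python
ABSTAIN = -1  # assumed abstain marker, following Snorkel's convention


def pairwise_rates(L):
    """Return (overlap, conflict) matrices for a label matrix L.

    L is a list of rows, one row per data point and one column per LF.
    Entry [i][j] is the fraction of all data points on which LF i and
    LF j both vote (overlap) or both vote and disagree (conflict).
    The diagonal is left at 0.0.
    """
    n = len(L)
    m = len(L[0]) if n else 0
    overlap = [[0.0] * m for _ in range(m)]
    conflict = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            # Votes on data points where neither LF abstains.
            both = [
                (row[i], row[j])
                for row in L
                if row[i] != ABSTAIN and row[j] != ABSTAIN
            ]
            overlap[i][j] = len(both) / n
            conflict[i][j] = sum(a != b for a, b in both) / n
    return overlap, conflict


# Toy label matrix: 4 data points, 3 hypothetical LFs.
L = [
    [1, 1, -1],
    [0, 1, 0],
    [-1, -1, 1],
    [1, -1, 1],
]
overlap, conflict = pairwise_rates(L)
for name, matrix in (("overlap", overlap), ("conflict", conflict)):
    print(name)
    for row in matrix:
        print("  ", row)
```

Both matrices are symmetric, so only the upper triangle carries information. A label matrix held as a NumPy array can be passed in via `L.tolist()`.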

Describe alternatives you've considered

Depending on the problem, the highest overlap and conflict rates alone may sometimes be enough, if the highest rate is indeed the only one reported.

Additional context

So far I have not looked at any of the Snorkel project's code, so I do not have an outline of an implementation. I am only putting the idea forward for further discussion.

Thank you in advance, Manos.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

vkrishnamurthy11 commented 2 years ago

Hello @gionanide, I think the functionality you are looking for is in this method. Let us know if that solves your problem. https://github.com/snorkel-ai/strap/blob/303dc8b36327f1fac8afe4d237751044af2cc046/src/python/lf_analysis/metrics.py#L243

gionanide commented 2 years ago

Hello @vkrishnamurthy11, thank you for your answer. I do not seem to have access to the page/link you sent me.

vkrishnamurthy11 commented 2 years ago

@gionanide My apologies. Can you please see if this link works: https://github.com/snorkel-ai/strap/blob/main/src/python/lf_analysis/metrics.py#L306

gionanide commented 2 years ago

@vkrishnamurthy11 I do not think I have access to those links. Or maybe I am doing something wrong.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.