snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0
5.81k stars 857 forks

Include text label names in LFAnalysis.lf_summary(label_names=[]) #1549

Closed rjurney closed 4 years ago

rjurney commented 4 years ago

Is your feature request related to a problem? Please describe.

I can't read the polarity field in the LFAnalysis.lf_summary() table. It means nothing to me. I'm a human; I read text labels, not numeric ones.

Isn't this one much easier to read with the labels printed out?

image

Describe the solution you'd like

I want an argument named label_names added to LFAnalysis.lf_summary that maps numeric labels to label names that are displayed in the table. I achieve this with the following code:

import pandas as pd
from snorkel.labeling import LFAnalysis, PandasLFApplier

ABSTAIN   = -1
GENERAL   = 0
API       = 1
EDUCATION = 2

label_pairs = [
    (ABSTAIN, 'ABSTAIN'),
    (GENERAL, 'GENERAL'),
    (API, 'API'),
    (EDUCATION, 'EDUCATION'),
]

# Forward and reverse indexes to labels/names
number_to_name_dict = dict(label_pairs)
name_to_number_dict = {name: number for number, name in label_pairs}

# Some code that prepares LFs in a list called: lfs
# ...

# Prepare a name/label DataFrame to join to the LF Summary DataFrame below
lf_names = [lf.name for lf in lfs]
# Note: _resources is a private attribute; this assumes each LF was created
# with resources containing a 'label' key
lf_labels = [lf._resources['label'] for lf in lfs]
lf_label_names = [{'Labels': [number_to_name_dict[label]]} for label in lf_labels]
label_name_df = pd.DataFrame(lf_label_names, index=lf_names)

# Apply the labels to the data
applier  = PandasLFApplier(lfs=lfs)
L_train  = applier.apply(df=df_train)
L_test   = applier.apply(df=df_test)

# Analyze the results
lfa = LFAnalysis(L=L_test, lfs=lfs)
lfs_df = lfa.lf_summary(Y=y_test)

# Join the label names into the table
lfs_df.join(label_name_df)

LFAnalysis.lf_summary could do this internally through a label_names argument, which could be either a dict keyed by numeric label or a list whose positions correspond to labels 0 and up. I don't think we would need to include -1 for ABSTAIN.
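
A rough sketch of how the two accepted forms of label_names might be normalized and applied to a polarity list. The helper names normalize_label_names and polarity_to_names are hypothetical and do not exist in Snorkel; this only illustrates the idea, not how the library would have to implement it:

```python
from typing import Dict, Iterable, List, Mapping, Sequence, Union

ABSTAIN = -1  # same convention as in the snippet above

# Hypothetical type for the proposed argument: dict keyed by label, or list indexed from 0
LabelNames = Union[Mapping[int, str], Sequence[str]]


def normalize_label_names(label_names: LabelNames) -> Dict[int, str]:
    """Turn either accepted form into a plain {numeric label: name} dict."""
    if isinstance(label_names, Mapping):
        return dict(label_names)
    # A list is assumed to be ordered so that position i names label i
    return dict(enumerate(label_names))


def polarity_to_names(polarity: Iterable[int], label_names: LabelNames) -> List[str]:
    """Translate one LF's polarity (the labels it emits) into label names."""
    names = normalize_label_names(label_names)
    # Callers do not need to supply a name for ABSTAIN
    names.setdefault(ABSTAIN, "ABSTAIN")
    # Fall back to the number itself when a label has no name
    return [names.get(label, str(label)) for label in polarity]


print(polarity_to_names([0, 2], ["GENERAL", "API", "EDUCATION"]))
print(polarity_to_names([1], {0: "GENERAL", 1: "API", 2: "EDUCATION"}))
```

Accepting both forms keeps the common case short (a list for labels 0 and up) while still allowing a dict when label values are sparse.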

Describe alternatives you've considered

Today I thought about filling my apartment with plastic balls, crawling underneath them and taking a long nap. Instead I went with this.

Additional context

I'm serious about the ball thing.

ajratner commented 4 years ago

@rjurney thanks for suggesting! +1 to this as a useful feature. Current team a bit spread thin, but if you want to submit a PR definitely do!

rjurney commented 4 years ago

Ok, I'll try. Thanks!

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.