Create Scoped Categoricals

ajratner commented 7 years ago

Create a mapping in the Snorkel front-end so that each Candidate (variable) can have its own support. This should then require only minimal changes to the compiler (just changing the cardinality) and none to NS!

Example: suppose we have a closed-world entity-linking task, e.g. mapping disease mentions to MeSH IDs. For a mention like "headache", we don't want to consider all possible IDs (something like 40,000, I think?) during learning / inference; instead, we probably have a small list of IDs to consider (e.g. the set of all labels that any LF gives for that mention).
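A rough sketch of the idea, not the code on the branch: build each candidate's support from the nonzero labels its LFs emit, then remap the labels to a local 1..k range so that the per-candidate cardinality is just the support size. The function names, the dense list-of-lists label matrix, and the convention that 0 means "abstain" are assumptions made for illustration.

```python
def build_scoped_supports(L):
    """Return (supports, cardinalities) for a dense label matrix L.

    L is a list of rows, one per candidate; each entry is an int label,
    with 0 assumed to mean "abstain" (hypothetical convention).
    """
    supports = []
    for row in L:
        # Distinct nonzero labels seen for this candidate, in sorted order
        supports.append(sorted({v for v in row if v != 0}))
    cardinalities = [len(s) for s in supports]
    return supports, cardinalities


def remap_labels(L, supports):
    """Rewrite each nonzero label as its 1-based index in the candidate's support."""
    L_local = []
    for row, values in zip(L, supports):
        to_local = {v: k + 1 for k, v in enumerate(values)}
        L_local.append([to_local[v] if v != 0 else 0 for v in row])
    return L_local


# Three candidates, three LFs; the large integers stand in for global IDs
L = [
    [40123, 0, 7],
    [0, 0, 0],
    [5, 5, 9],
]
supports, cardinalities = build_scoped_supports(L)
L_local = remap_labels(L, supports)
```

Here the first candidate ends up with cardinality 2 instead of the size of the full ID space, which is the change the compiler would need to see.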

@jason-fries for EL stuff!

Working on branch scoped_categoricals; to-dos:

[x] Remap label matrix in GenerativeModel class
[x] Update all references to cardinality in FG compilation
[x] Switch LF_acc_prior to LF_acc_prior_weight and debug
[x] Update all references to cardinality in GenerativeModel.learned_lf_stats
[x] Update all references to cardinality + remap to original values in GenerativeModel.marginals
[x] Add simple example to categoricals notebook
[x] Write tests!
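
For the "remap to original values" item, one possible shape of the inverse step is sketched below. It assumes marginals come back as one probability per local index and that each candidate's support list is kept around from the forward remap; `marginals_to_original` is a hypothetical helper, not the actual `GenerativeModel.marginals` code.

```python
def marginals_to_original(local_marginals, supports):
    """Map each candidate's local marginals back to original label values.

    local_marginals[i][k] is assumed to be the probability of the k-th
    value in supports[i] (hypothetical layout).
    """
    result = []
    for probs, values in zip(local_marginals, supports):
        if len(probs) != len(values):
            raise ValueError("marginal length must match support size")
        result.append(dict(zip(values, probs)))
    return result


supports = [[7, 40123], [5, 9, 12]]
local_marginals = [[0.25, 0.75], [0.5, 0.25, 0.25]]
original = marginals_to_original(local_marginals, supports)
# Most probable original label for the second candidate
best = max(original[1], key=original[1].get)
```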

ajratner commented 7 years ago

Notes:

Need to put the first loop over all label variables back in, because looping over the sparse L won't cover the label=0 ones, which we still need to initialize!
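
To illustrate the problem under one reading of this note: iterating over a sparse matrix only visits stored (nonzero) entries, so any candidate whose labels are all 0 never shows up. The sketch below uses a plain dict keyed by (candidate, LF) as a stand-in for the sparse L; the function names are hypothetical.

```python
def collect_sparse_only(sparse_entries):
    """Visit only stored entries; all-zero candidates are silently skipped."""
    out = {}
    for (i, j), value in sorted(sparse_entries.items()):
        out.setdefault(i, []).append((j, value))
    return out


def collect_with_init(n_candidates, sparse_entries):
    """Initialize every candidate first, then fill in the stored entries."""
    # Pass 1: loop over all label variables so none is left uninitialized
    out = {i: [] for i in range(n_candidates)}
    # Pass 2: loop over the sparse entries only
    for (i, j), value in sorted(sparse_entries.items()):
        out[i].append((j, value))
    return out


# Candidate 1 has no nonzero labels, so it has no stored entries
sparse_entries = {(0, 0): 2, (0, 2): 1, (2, 1): 1}
skipped = collect_sparse_only(sparse_entries)
complete = collect_with_init(3, sparse_entries)
```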

ajratner commented 7 years ago

Want to write one more test, then done for now

snorkel-team / snorkel

Create Scoped Categoricals #649