Create a mapping in the Snorkel front-end so that each Candidate (variable) can have its own support. This should then require only minimal changes to the compiler (just changing cardinality) and none to NS!
Example: suppose we have a closed-world entity-linking task, e.g. mapping disease mentions to MeSH IDs. For a mention like "headache", we don't want to consider all possible IDs (something like 40,000, I think?) during learning / inference; instead we probably have a small list of IDs to consider (e.g. the set of all labels given by the LFs for that mention).
@jason-fries for EL stuff!
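A rough sketch of what the front-end mapping could look like, assuming the label matrix is stored sparsely as a dict keyed by (candidate, LF) with abstains (0) left out. The names `build_supports`, `remap_labels`, and `unmap_marginals` are hypothetical and only illustrate the idea; they are not existing Snorkel functions.

```python
from typing import Dict, List, Tuple

# Hypothetical sparse label matrix: {(candidate_idx, lf_idx): label}.
# Labels are original global IDs (e.g. MeSH IDs); 0 means abstain.
SparseL = Dict[Tuple[int, int], int]


def build_supports(L: SparseL, n_candidates: int) -> List[List[int]]:
    """Per-candidate support: the sorted set of labels emitted by any LF."""
    supports = [set() for _ in range(n_candidates)]
    for (i, _j), label in L.items():
        if label != 0:
            supports[i].add(label)
    return [sorted(s) for s in supports]


def remap_labels(L: SparseL, supports: List[List[int]]) -> SparseL:
    """Rewrite each label as a 1-based index into its candidate's support."""
    index = [{label: k + 1 for k, label in enumerate(s)} for s in supports]
    return {(i, j): index[i][label] for (i, j), label in L.items() if label != 0}


def unmap_marginals(local_marginals: List[List[float]],
                    supports: List[List[int]]) -> List[Dict[int, float]]:
    """Map marginals over local indices back to the original label values."""
    return [dict(zip(s, m)) for s, m in zip(supports, local_marginals)]


# Two candidates, three LFs; the IDs here are made up for illustration.
L = {(0, 0): 4011, (0, 1): 17, (0, 2): 4011, (1, 0): 256}
supports = build_supports(L, n_candidates=2)
cardinalities = [len(s) for s in supports]  # per-candidate cardinality
L_local = remap_labels(L, supports)
```

Here candidate 0 ends up with cardinality 2 rather than ~40,000, and the compiler only needs the per-candidate cardinalities. A candidate with no LF labels gets an empty support in this sketch and would need separate handling.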
Working on branch scoped_categoricals; to-dos:
[x] Remap label matrix in GenerativeModel class
[x] Update all references to cardinality in FG compilation
[x] Switch LF_acc_prior to LF_acc_prior_weight and debug
[x] Update all references to cardinality in GenerativeModel.learned_lf_stats
[x] Update all references to cardinality + remap to original values in GenerativeModel.marginals
Need to put the first loop over all label variables back in: looping over the sparse L only visits the nonzero entries, so it won't cover the label=0 variables, which we still need to initialize!
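A minimal sketch of the two-pass initialization, again assuming a dict-based sparse L. The field names (`is_evidence`, `initial_value`, `cardinality`) and the choice of support size + 1 (one extra value for abstain) are assumptions for illustration, not the actual compiler layout.

```python
from typing import Dict, List, Tuple


def init_label_variables(L: Dict[Tuple[int, int], int],
                         n_candidates: int,
                         n_lfs: int,
                         cardinalities: List[int]) -> List[dict]:
    """Initialize one label variable per (candidate, LF) pair."""
    variables = []
    # Pass 1: dense loop, so that abstaining (label=0) variables exist too.
    for i in range(n_candidates):
        for _j in range(n_lfs):
            variables.append({
                "is_evidence": True,
                "initial_value": 0,  # default: abstain
                # Assumption: support size plus one value for abstain.
                "cardinality": cardinalities[i] + 1,
            })
    # Pass 2: the sparse loop only touches the stored (nonzero) entries.
    for (i, j), label in L.items():
        variables[i * n_lfs + j]["initial_value"] = label
    return variables


# Labels here are already remapped to per-candidate local indices.
L_local = {(0, 1): 2, (1, 0): 1}
variables = init_label_variables(L_local, n_candidates=2, n_lfs=2,
                                 cardinalities=[2, 1])
```

With only the sparse pass, the variables for (0, 0) and (1, 1) would never be created or initialized.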