Closed rjurney closed 4 years ago
These are great suggestions!
Would you mind making these contributions, potentially as LF generators, in a pull request to https://github.com/snorkel-team/snorkel-zoo?
@vincentschen sure!
@rjurney thanks for this! Excited here!
Keyword Labeling Function Utilities
The Spam tutorial has the following code for creating keyword Labeling Functions, which are the most common (and surprisingly powerful) type of LF:
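The tutorial's code block did not survive in this copy of the issue. As a rough reminder of the pattern under discussion, here is a minimal sketch, not necessarily the tutorial's exact code. It assumes data points expose a `text` attribute and that labels are integers with `-1` meaning abstain. The `LabelingFunction` class below is a simplified stand-in so the sketch runs without Snorkel installed; with Snorkel, `from snorkel.labeling import LabelingFunction` would take its place.

```python
from types import SimpleNamespace
from typing import Any, Callable, Mapping, Optional

ABSTAIN = -1
HAM = 0
SPAM = 1


class LabelingFunction:
    """Simplified stand-in for snorkel.labeling.LabelingFunction (assumption)."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})

    def __call__(self, x: Any) -> int:
        # Pass the stored resources to the wrapped function as keyword arguments.
        return self._f(x, **self._resources)


def keyword_lookup(x, keywords, label):
    """Return `label` if any keyword occurs in x.text, otherwise abstain."""
    text = x.text.lower()
    if any(word in text for word in keywords):
        return label
    return ABSTAIN


def make_keyword_lf(keywords, label=SPAM):
    """Wrap keyword_lookup in an LF named after the first keyword."""
    return LabelingFunction(
        name=f"keyword_{keywords[0]}",
        f=keyword_lookup,
        resources=dict(keywords=keywords, label=label),
    )


keyword_my = make_keyword_lf(keywords=["my"])
comment = SimpleNamespace(text="Check out MY channel")
print(keyword_my.name, keyword_my(comment))
```

Every keyword LF built this way repeats the same two steps (a lookup function plus a small factory), which is the boilerplate this request is about.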
There should be an interface for a KeyWordLabelingFunction that incorporates this capability, as it is the most common usage pattern.
Describe the solution you'd like
I have improved this code so that it can optionally search one or more fields and create a separate LF per keyword:
Therefore I propose the method `snorkel.labeling.lf.nlp.keyword_labeling_function` with the following interface:
If `separate=True`, new LFs are created for each term; otherwise OR is used. The method could then take a `method=['or', 'and']` option to enable multiple phrase matching.
I don't know if this is the right interface, but this seems the right method.
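The interface block is missing from this copy of the issue, so the following is only a sketch of what the proposal might look like, not the author's actual code. The function name, `separate`, and `method` come from the text above; the `keywords`, `label`, and `fields` parameter names, the LF naming scheme, and the list return type are assumptions. The `LabelingFunction` class is again a simplified stand-in for `snorkel.labeling.LabelingFunction` so the example runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Mapping, Optional, Sequence

ABSTAIN = -1
SPAM = 1


class LabelingFunction:
    """Simplified stand-in for snorkel.labeling.LabelingFunction (assumption)."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})

    def __call__(self, x: Any) -> int:
        return self._f(x, **self._resources)


def _keyword_match(x, keywords, fields, label, method):
    """Vote `label` if the keywords match in any of the given fields."""
    # Missing or None fields are treated as empty strings.
    texts = [str(getattr(x, field, None) or "").lower() for field in fields]
    # A keyword counts as a hit if it appears in at least one field.
    hits = [any(kw.lower() in text for text in texts) for kw in keywords]
    matched = all(hits) if method == "and" else any(hits)
    return label if matched else ABSTAIN


def keyword_labeling_function(
    keywords: Sequence[str],
    label: int,
    fields: Sequence[str] = ("text",),  # hypothetical parameter name
    separate: bool = False,
    method: str = "or",
) -> List[LabelingFunction]:
    """Hypothetical sketch of the proposed helper; always returns a list of LFs."""
    if method not in ("or", "and"):
        raise ValueError("method must be 'or' or 'and'")
    if not keywords:
        raise ValueError("at least one keyword is required")
    # separate=True: one LF per keyword; otherwise one LF over all keywords.
    groups = [[kw] for kw in keywords] if separate else [list(keywords)]
    return [
        LabelingFunction(
            name="keyword_" + "_".join(kw.replace(" ", "_") for kw in group),
            f=_keyword_match,
            resources=dict(
                keywords=group, fields=tuple(fields), label=label, method=method
            ),
        )
        for group in groups
    ]


x = SimpleNamespace(title="New video", text="Please SUBSCRIBE to my channel")
lfs = keyword_labeling_function(
    ["subscribe", "check out"], label=SPAM, fields=["title", "text"], separate=True
)
print([(lf.name, lf(x)) for lf in lfs])
```

Returning a list in both modes keeps the call site uniform, since the result can be passed to an applier the same way whether `separate` is set or not. With `separate=True` each LF holds a single keyword, so `method` only matters for the combined case.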
Describe alternatives you've considered
This is the only alternative I can think of without disruptive changes to the `LabelingFunction` interface.
Additional context
I use this code enough that I'd be happy to write the patch.