gengkunling closed 6 years ago
Hi @gengkunling - I pushed a fix, will merge in as soon as tests pass.
More broadly, these matchers are all subclasses of snorkel.matchers.RegexMatchEach, which matches a specified attribute (e.g. the NER tags) of each token against a supplied regex. So in your case, for example, you could instead just use:
org_matcher = RegexMatchEach(rgx='ORG', attrib='ner_tags')
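To illustrate the idea, here is a minimal plain-Python sketch. It is not Snorkel's actual implementation: `regex_match_each` and the sample tag lists are made up for illustration, and it assumes every token's tag has to match the regex in full. Under those assumptions, a pattern like `ORG(ANIZATION)?` would accept both spaCy-style and CoreNLP-style tags:

```python
import re


def regex_match_each(token_attribs, rgx):
    """Hypothetical helper: True if the span is non-empty and every
    token attribute (e.g. an NER tag) fully matches the regex."""
    pattern = re.compile(rgx)
    return bool(token_attribs) and all(
        pattern.fullmatch(value) is not None for value in token_attribs
    )


# Made-up NER tags for the same two-token span under each label scheme.
spacy_tags = ["ORG", "ORG"]
corenlp_tags = ["ORGANIZATION", "ORGANIZATION"]

# One pattern that covers both spellings of the organization label.
org_rgx = r"ORG(ANIZATION)?"

spacy_ok = regex_match_each(spacy_tags, org_rgx)
corenlp_ok = regex_match_each(corenlp_tags, org_rgx)

# A span containing a non-organization token should be rejected.
mixed_ok = regex_match_each(["ORG", "O"], org_rgx)
```

If the real matcher behaves the same way, passing a regex that covers both spellings might let a single matcher work with either parser, but that depends on how Snorkel anchors the regex.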
Hope this helps! Alex
This should be fixed? If not, please re-open.
I was trying to build an organization matcher using "snorkel.matchers.OrganizationMatcher". However, the matcher seems to work only with Stanford CoreNLP, not with spaCy.
I looked into the code and found that:
It does not work with spaCy because spaCy's NER returns "ORG" rather than "ORGANIZATION".
The other matchers have similar issues. Is there a simple way to fix this?