what are the `extract.matches` patterns analogous to `constants.POS_REGEX_PATTERNS` ?

chartbeat-labs / textacy

NLP, before and after spaCy


Hi @joefromct, thanks for reporting. This constant was originally used by the pos_regex_matches() function, but that function has been deprecated for a while and (on the develop branch) removed entirely. So, I should remove it.

I don't think there's any way to exactly replicate this functionality using spaCy's Matcher, but you can get pretty close:

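import spacy
from spacy.matcher import Matcher

# Assumption: any loaded English pipeline with a POS tagger works here;
# "en_core_web_sm" is just one commonly installed option.
nlp = spacy.load("en_core_web_sm")

# Optional determiner, any numbers, any adjectives, then one or more nouns.
pattern = [
    {'POS': 'DET', 'OP': '?'},
    {'POS': 'NUM', 'OP': '*'},
    {'POS': 'ADJ', 'OP': '*'},
    # 'OP' belongs to the token dict, not inside the 'POS' value dict
    {'POS': {'IN': ['NOUN', 'PROPN']}, 'OP': '+'}
]
matcher = Matcher(nlp.vocab)
matcher.add("np", [pattern])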
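
To see what that pattern returns, here is a minimal sketch of running it over a document. It assumes spaCy v3 and, so that no trained model is needed, builds a Doc by hand with made-up words and POS tags; the example sentence and the variable names (`all_matches`, `longest`) are illustrative only. The Matcher reports every span the pattern can match, so `spacy.util.filter_spans` is used to keep only the longest non-overlapping ones, which is roughly what a greedy regex over POS tags would have produced.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

# A blank pipeline supplies a vocab without downloading a trained model.
nlp = spacy.blank("en")

# Hand-assigned coarse POS tags stand in for a tagger's output (illustrative only).
words = ["The", "two", "big", "dogs", "barked", "loudly"]
pos = ["DET", "NUM", "ADJ", "NOUN", "VERB", "ADV"]
doc = Doc(nlp.vocab, words=words, pos=pos)

pattern = [
    {"POS": "DET", "OP": "?"},
    {"POS": "NUM", "OP": "*"},
    {"POS": "ADJ", "OP": "*"},
    {"POS": {"IN": ["NOUN", "PROPN"]}, "OP": "+"},
]
matcher = Matcher(nlp.vocab)
matcher.add("np", [pattern])

# Each match is (match_id, start, end); turn them into Span objects.
all_matches = [doc[start:end] for _, start, end in matcher(doc)]

# Keep only the longest non-overlapping spans.
longest = spacy.util.filter_spans(all_matches)

print([span.text for span in all_matches])
print([span.text for span in longest])
```

With a real pipeline, the hand-built Doc would simply be replaced by `doc = nlp(text)`.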


what are the `extract.matches` patterns analogous to `constants.POS_REGEX_PATTERNS` ? #318
