We found that in multiple projects we had duplicate code for using spaCy’s blazing fast matcher to do the same thing: Match-Replace-Grammaticalize. So we wrote replaCy!
spacy >= 2.0 (not installed by default, but replaCy needs to be instantiated with an nlp object)
pip install replacy
from replacy import ReplaceMatcher
from replacy.db import load_json
import spacy
match_dict = load_json('/path/to/your/match/dict.json')
# load nlp spacy model of your choice
nlp = spacy.load("en_core_web_sm")
rmatcher = ReplaceMatcher(nlp, match_dict=match_dict)
# get inflected suggestions
# look up the first suggestion
span = rmatcher("She extracts revenge.")[0]
span._.suggestions
# >>> ['exacts']
ReplaceMatcher accepts either raw text or a spaCy Doc.
# text is ok
span = rmatcher("She extracts revenge.")[0]
# doc is ok too
doc = nlp("She extracts revenge.")
span = rmatcher(doc)[0]
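Once a span and its suggestions are in hand, the replacement itself is a string splice. The sketch below is not part of replaCy: `apply_replacements` is a hypothetical helper, and it takes plain `(start_char, end_char, replacement)` tuples so that it runs without a spaCy model. With replaCy, such a tuple could presumably be built from each span's `start_char` and `end_char` (standard spaCy `Span` attributes) and the first entry of `span._.suggestions`.

```python
def apply_replacements(text, edits):
    """Splice replacements into text.

    edits: iterable of (start_char, end_char, replacement) tuples with
    non-overlapping character ranges. Hypothetical helper, not replaCy API.
    """
    result = text
    # Apply from the end of the string backwards so earlier offsets stay valid.
    for start, end, replacement in sorted(edits, reverse=True):
        result = result[:start] + replacement + result[end:]
    return result


text = "She extracts revenge."
# With replaCy this tuple might be built as
# (span.start_char, span.end_char, span._.suggestions[0]).
corrected = apply_replacements(text, [(4, 12, "exacts")])
print(corrected)  # She exacts revenge.
```

Working backwards through the offsets avoids recomputing positions after each splice, which matters when a text contains more than one match.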
Here is a minimal match_dict.json:
{
"extract-revenge": {
"patterns": [
{
"LEMMA": "extract",
"TEMPLATE_ID": 1
}
],
"suggestions": [
[
{
"TEXT": "exact",
"FROM_TEMPLATE_ID": 1
}
]
],
"match_hook": [
{
"name": "succeeded_by_phrase",
"args": "revenge",
"match_if_predicate_is": true
}
],
"test": {
"positive": [
"And at the same time extract revenge on those he so despises?",
"Watch as Tampa Bay extracts revenge against his former Los Angeles Rams team."
],
"negative": ["Mother flavours her custards with lemon extract."]
}
}
}
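To make the `match_hook` entry more concrete, here is a rough sketch of how such a hook could behave, reading the field names literally: a predicate is built from `args`, and the match is kept only when the predicate's result equals `match_if_predicate_is`. This is an illustration on plain token lists, not replaCy's implementation. The `predicate` and `keep_match` functions are assumptions made for the example, and the real hook works on a spaCy Doc.

```python
import json

# The hook entry from the minimal match dict above.
hook_config = json.loads("""
{
    "name": "succeeded_by_phrase",
    "args": "revenge",
    "match_if_predicate_is": true
}
""")


def succeeded_by_phrase(phrase):
    """Build a predicate that is True when `phrase` directly follows the match.

    Illustrative stand-in only; it works on a list of token strings.
    """
    def predicate(tokens, start, end):
        following = " ".join(tokens[end:]).lower()
        return following.startswith(phrase.lower())
    return predicate


def keep_match(tokens, start, end, config):
    """Keep the match only if the predicate result equals the expected value."""
    predicate = succeeded_by_phrase(config["args"])
    return predicate(tokens, start, end) == config["match_if_predicate_is"]


positive = ["She", "extracts", "revenge", "."]
negative = ["Mother", "flavours", "her", "custards", "with", "lemon", "extract", "."]

# The matched token is "extracts" at index 1 and "extract" at index 6.
print(keep_match(positive, 1, 2, hook_config))  # True
print(keep_match(negative, 6, 7, hook_config))  # False
```

Under this reading, the pattern alone would match any form of "extract", and the hook is what narrows the rule to "extract revenge".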
For more information on how to compose a match_dict, see our wiki.
If you use replaCy in your research, please cite it with the following BibTeX entry:
@misc{havens2019replacy,
title = {SpaCy match and replace, maintaining conjugation},
author = {Sam Havens and Aneta Stal and Manhal Daaboul},
url = {https://github.com/Qordobacode/replaCy},
year = {2019}
}