writer / replaCy

spaCy match and replace, maintaining conjugation
https://pypi.org/project/replacy/
MIT License

Eng 7251 support multiple whitespaces #27

Closed melisa-writer closed 4 years ago

melisa-writer commented 4 years ago

ENG-7251 Allow matching tokens separated by multiple whitespaces

Multiple consecutive whitespaces may appear after normalizing nonstandard whitespace characters, e.g. "Here␣is␣a\u180E\u200Bproblem." -> "Here␣is␣a␣␣problem."
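A minimal sketch of how such a normalization step can produce a double space. The character table and the `normalize_whitespace` name are assumptions made for illustration, not replaCy's actual code; the only point is that each nonstandard character is replaced one-for-one with a regular space, so two adjacent ones become two spaces.

```python
# Assumed table of nonstandard whitespace-like characters, for illustration only.
# Python's str.isspace() does not treat U+180E or U+200B as whitespace,
# so they are listed explicitly instead.
NONSTANDARD_WHITESPACE = {
    "\u180e": " ",  # Mongolian vowel separator
    "\u200b": " ",  # zero width space
    "\u00a0": " ",  # no-break space
}
_TABLE = str.maketrans(NONSTANDARD_WHITESPACE)


def normalize_whitespace(text: str) -> str:
    """Replace each listed character with one regular space (length is preserved)."""
    return text.translate(_TABLE)


normalized = normalize_whitespace("Here is a\u180e\u200bproblem.")
print(repr(normalized))
```

Because the replacement is one-for-one, character offsets in the normalized text still line up with the original, at the cost of leaving runs of spaces between tokens.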

The pattern can be preceded and followed by whitespace tokens, so that the preceded_by... and succeeded_by... match hooks keep working.
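One way to picture this is as a transformation on a spaCy-style token pattern (a list of dicts). The sketch below is hypothetical and not taken from the PR diff: `allow_whitespace` is an invented helper, and the choice of the `"*"` operator is an assumption. `IS_SPACE` and `OP` are standard spaCy Matcher pattern keys.

```python
from typing import Any, Dict, List

TokenPattern = Dict[str, Any]

# Matches zero or more whitespace tokens.
OPTIONAL_SPACE: TokenPattern = {"IS_SPACE": True, "OP": "*"}


def allow_whitespace(pattern: List[TokenPattern]) -> List[TokenPattern]:
    """Hypothetical helper: put an optional whitespace slot before, between,
    and after the token specs of a pattern. The input is not modified."""
    padded = [dict(OPTIONAL_SPACE)]
    for token in pattern:
        padded.append(dict(token))
        padded.append(dict(OPTIONAL_SPACE))
    return padded


example = allow_whitespace([{"LOWER": "a"}, {"LOWER": "problem"}])
for spec in example:
    print(spec)
```

With a pattern padded this way, a match can absorb the whitespace tokens around it, so a hook that inspects the neighbouring token would see a word rather than a whitespace token. How replaCy itself adjusts the match span is not shown here.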

edit: support passing both text and doc (avoiding double conversion)