`whole_word` flag

LeaVerou / brep

Write batch find & replace scripts that transform files with a simple human-readable syntax

To truly emulate text editors' find & replace, we also need a whole_word flag. However, this is not as trivial as wrapping the regex with \b(?: ... )\b. A word boundary (\b) only matches at a transition between \w and \W (in either direction). So when the match itself starts or ends with a non-word character, \b at that edge only succeeds if a word character sits right next to the match: a match like @foo surrounded by spaces is rejected, even though it is already a "whole word" match.
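To make the failure concrete, here is a minimal sketch of the naive \b wrapping, assuming JavaScript regex semantics. `naiveWholeWord` is a hypothetical helper written for this illustration, not part of brep.

```javascript
// Hypothetical helper: wrap a pattern source in \b ... \b (the naive approach).
function naiveWholeWord(source, flags = "g") {
  return new RegExp(`\\b(?:${source})\\b`, flags);
}

// Works when the pattern starts and ends with word characters.
console.log("foo food".match(naiveWholeWord("foo"))); // [ 'foo' ]

// Fails when the pattern starts with a non-word character:
// between " " and "@" there is no \w/\W transition, so \b cannot match.
console.log("say @foo now".match(naiveWholeWord("@foo"))); // null

// Worse, it matches when the pattern is glued to a preceding word character.
console.log("say x@foo now".match(naiveWholeWord("@foo"))); // [ '@foo' ]
```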

We probably need some lookarounds instead:

(?<=^|\W) = preceded by beginning of line/file OR non-word character
(?=$|\W) = followed by end of line/file OR non-word character
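A rough sketch of how the two lookarounds could be combined, again assuming JavaScript (which allows alternation inside a lookbehind). `wholeWord` is a hypothetical name used only for this example.

```javascript
// Hypothetical helper: require start-of-input or a non-word character before
// the match, and end-of-input or a non-word character after it.
function wholeWord(source, flags = "g") {
  return new RegExp(`(?<=^|\\W)(?:${source})(?=$|\\W)`, flags);
}

console.log("foo food".match(wholeWord("foo")));        // [ 'foo' ]
console.log("say @foo now".match(wholeWord("@foo")));   // [ '@foo' ]
console.log("say x@foo now".match(wholeWord("@foo")));  // null
```

Since a newline is itself a \W character, line starts and ends are covered by the \W branch even without the m flag; ^ and $ are only needed for the very beginning and end of the input.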

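Furthermore, \W is not Unicode-aware: it treats every character outside [A-Za-z0-9_], including non-ASCII letters such as é, as a non-word character. For a Unicode-aware version, I think we need [^_\p{L}\p{N}] (in JavaScript, \p{…} requires the u flag).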
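A small sketch of the difference, assuming JavaScript with the u flag enabled. `NON_WORD` and `wholeWordUnicode` are hypothetical names for illustration only.

```javascript
// Hypothetical Unicode-aware replacement for \W: anything that is not
// an underscore, a letter, or a number in any script.
const NON_WORD = "[^_\\p{L}\\p{N}]";

function wholeWordUnicode(source, flags = "gu") {
  // The u flag is required for \p{...} property escapes.
  return new RegExp(`(?<=^|${NON_WORD})(?:${source})(?=$|${NON_WORD})`, flags);
}

// ASCII-only \W treats "é" as a non-word character, so "caf" wrongly
// counts as a whole word inside "café".
console.log(/(?<=^|\W)caf(?=$|\W)/.test("caf\u00e9")); // true

// The Unicode-aware version sees "é" as a letter and rejects the partial match.
console.log("caf\u00e9".match(wholeWordUnicode("caf"))); // null

// The full word still matches.
console.log("na\u00efve caf\u00e9".match(wholeWordUnicode("caf\u00e9"))); // [ 'café' ]
```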

`whole_word` flag #12