lukeleppan / better-word-count

Counts the words of selected text in the editor.
MIT License

Match words using unicode property matching #89

Open mgmeyers opened 1 year ago

mgmeyers commented 1 year ago

Feel free to close this if it's not in line with your vision of things, but this seems to be an effective solve for #55.

This uses Unicode property matching to simplify word matching (see https://javascript.info/regexp-unicode).

\P{Z}* -> Matches 0 or more of anything that is not a separator (e.g., space, tab, newline)

[\p{L}\p{N}] -> Matches anything that is a letter or a number

This is based on @jon-heard's algorithm, and defines a word as anything that contains at least one letter or number (as defined by the host language) and no spaces, tabs, etc.
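
For illustration, here is a minimal sketch of that word definition in JavaScript. It is not the code from this PR: `WORD_PATTERN` and `countWords` are hypothetical names, and the pattern is an assumption built from the description above. One detail worth noting is that tab and newline are control characters rather than members of `\p{Z}`, so the sketch also excludes `\s` from the surrounding runs.

```javascript
// Hypothetical sketch, not the plugin's actual implementation.
// A "word" is a run of non-whitespace, non-separator characters
// that contains at least one Unicode letter or number.
// The `u` flag is required for \p{...} property escapes.
const WORD_PATTERN = /[^\s\p{Z}]*[\p{L}\p{N}][^\s\p{Z}]*/gu;

function countWords(text) {
  // String.prototype.match with a global regex returns every match, or null.
  const matches = text.match(WORD_PATTERN);
  return matches ? matches.length : 0;
}
```

Under this definition a run of pure punctuation such as `---` is not counted, while `don't` and `42` each count as one word. Text in scripts that do not separate words with spaces would be counted per unbroken run, which may or may not be what the plugin wants.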

rzfzr commented 8 months ago

@lukeleppan Is anybody else accepting PRs?