Anders429 / word_filter

A Word Filter for filtering text.
Apache License 2.0

Reintroduce Repeated Match Mode #46

Closed Anders429 closed 3 years ago

Anders429 commented 3 years ago

Previously, constructing a WordFilter allowed specifying a repeated match mode, which defined whether repeated characters should be checked. Now that repetitions are handled entirely within code generation in the upcoming version 0.6.0, this option can be reintroduced in the code generation builder.

It is worth noting that the current repeated matching actually allows repeated graphemes, rather than repeated characters. This should be reflected in whatever naming scheme is used.
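To illustrate why the distinction matters, here is a minimal std-only sketch. The helper `collapse_repeated_chars` is hypothetical and is not part of word_filter. It collapses runs of identical `char`s, which works for plain ASCII but misses a repeated grapheme that is spelled with a combining mark (`e` followed by U+0301), because no two adjacent `char`s are equal there.

```rust
/// Hypothetical helper (not part of word_filter): collapses consecutive
/// identical `char`s, i.e. repetition handling at the character level.
pub fn collapse_repeated_chars(input: &str) -> String {
    let mut chars: Vec<char> = input.chars().collect();
    // `Vec::dedup` removes consecutive duplicate elements only.
    chars.dedup();
    chars.into_iter().collect()
}
```

With this helper, `"heeello"` collapses to `"helo"`, while `"e\u{301}e\u{301}"` (the same accented grapheme written twice) is returned unchanged. Detecting that repetition would require comparing grapheme clusters instead of `char`s.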

An additional feature to consider is allowing repeated separators. In this case, it may make sense to use a bitflag instead of an enum. Repeated character matching could also be implemented, which would only change the way repetitions are handled within graphemes.
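A rough sketch of what the bitflag idea could look like, using only the standard library. The type `RepeatedMatch` and its constant names are assumptions made for illustration, not an existing or planned API of this crate. The point is that independent options (graphemes, separators, characters) can be combined, which a plain enum with one variant per mode cannot express as directly.

```rust
use std::ops::BitOr;

/// Hypothetical set of repetition allowances, stored as bits in a `u8`.
/// The names here are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RepeatedMatch(u8);

impl RepeatedMatch {
    /// No repetitions are allowed.
    pub const NONE: Self = Self(0);
    /// Repeated graphemes within words are allowed.
    pub const GRAPHEMES: Self = Self(1 << 0);
    /// Repeated separators are allowed.
    pub const SEPARATORS: Self = Self(1 << 1);
    /// Repeated characters within a grapheme are allowed.
    pub const CHARACTERS: Self = Self(1 << 2);

    /// Returns `true` if every bit set in `other` is also set in `self`.
    pub const fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for RepeatedMatch {
    type Output = Self;

    /// Combines two sets of allowances into one.
    fn bitor(self, rhs: Self) -> Self {
        Self(self.0 | rhs.0)
    }
}
```

Flags would then be combined as `RepeatedMatch::GRAPHEMES | RepeatedMatch::SEPARATORS` and queried with `contains`.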

Anders429 commented 3 years ago

Punting this one to another release. I'd rather release 0.6.0 right now than wait for this feature.

Anders429 commented 3 years ago

This is on track for 0.8.0. Currently putting together an implementation to permit specifying repetition allowances in words, exceptions, and separators individually.
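As a sketch of what "individually" could mean for a builder, assuming one independent toggle per category: the names `GeneratorSketch`, `RepetitionSettings`, and the `repeat_*` methods below are hypothetical and do not reflect the actual 0.8.0 API.

```rust
/// Hypothetical per-category repetition settings (illustrative names only).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct RepetitionSettings {
    pub words: bool,
    pub exceptions: bool,
    pub separators: bool,
}

/// Hypothetical stand-in for a code generation builder.
#[derive(Debug, Default)]
pub struct GeneratorSketch {
    repetitions: RepetitionSettings,
}

impl GeneratorSketch {
    /// Starts with repetitions disallowed in every category.
    pub fn new() -> Self {
        Self::default()
    }

    /// Allows or disallows repetitions within words.
    pub fn repeat_words(mut self, allowed: bool) -> Self {
        self.repetitions.words = allowed;
        self
    }

    /// Allows or disallows repetitions within exceptions.
    pub fn repeat_exceptions(mut self, allowed: bool) -> Self {
        self.repetitions.exceptions = allowed;
        self
    }

    /// Allows or disallows repetitions within separators.
    pub fn repeat_separators(mut self, allowed: bool) -> Self {
        self.repetitions.separators = allowed;
        self
    }

    /// Returns the settings accumulated so far.
    pub fn repetitions(&self) -> RepetitionSettings {
        self.repetitions
    }
}
```

Under these assumptions, `GeneratorSketch::new().repeat_words(true).repeat_separators(true)` would enable repetitions for words and separators while leaving exceptions strict.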

This also adds repetition support for separators, which had previously been supported but was put aside due to a lack of optimizations. #79 addresses the optimization issues and is extended to separator repetitions.