Open Pigeon-Barry opened 3 years ago
I can partially solve this issue and throw an exception in the cases where I cannot handle the lookaround.
My idea is to handle those patterns where the lookaround matches text that is shorter than or equal in length to the part of the pattern it influences.
For example:
(?!BG)[A-Z]{2}
Here the part under the negative lookahead is 2 chars long, and the part it influences, [A-Z]{2}, is also 2 chars long. I can handle it by retrying the [A-Z]{2} part until it satisfies the restriction.
In the same way I could handle (?!B)[A-Z]{2} or (?!.X)[A-Z]{2}.
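To make the retry idea concrete, here is a minimal sketch in Python (not the library's actual code). It assumes the influenced part is a fixed run of uppercase letters and that the lookahead is no longer than that part; the function names, the `max_attempts` limit, and the use of `RuntimeError` are hypothetical choices for illustration.

```python
import random
import re
import string


def generate_letters(n, rng):
    """Generate n random uppercase letters, i.e. a value for [A-Z]{n}."""
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(n))


def generate_with_negative_lookahead(forbidden, n, rng, max_attempts=1000):
    """Retry [A-Z]{n} until the negative lookahead (?!forbidden) is satisfied.

    Assumes `forbidden` never needs more than n characters, so checking it
    against the generated part alone is enough.
    """
    lookahead = re.compile(forbidden)
    for _ in range(max_attempts):
        candidate = generate_letters(n, rng)
        # re.match anchors at the start, which mirrors a lookahead placed
        # right before the influenced part.
        if lookahead.match(candidate) is None:
            return candidate
    # Hypothetical failure mode: give up instead of looping forever.
    raise RuntimeError("could not satisfy the lookahead within max_attempts")


rng = random.Random(0)
print(generate_with_negative_lookahead("BG", 2, rng))  # never "BG"
print(generate_with_negative_lookahead(".X", 2, rng))  # second char is never "X"
```

Retrying only the influenced part keeps the rest of the generated text untouched, which is why the length restriction matters: a longer lookahead would depend on characters outside that part.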
Funny enough that I could also handle this pattern
Though that kind of pattern could be hard to handle
(?!X+)[A-Z]{2}[CDE]
Or, really, I could go the easy way first: generate the text and then verify that it matches all the lookaround constraints. If it does not, regenerate; if it does, return it to the user. It is brute force, but the easiest to implement. I can think about performance improvements for special cases later.
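The generate-then-verify approach could look roughly like the sketch below. It is written in Python for brevity and uses Python's `re` module as the verifier; `naive_generate` is a hypothetical stand-in for a generator that ignores lookarounds, and `max_attempts` is an assumed safety limit.

```python
import random
import re
import string


def naive_generate(rng):
    """Hypothetical generator for [A-Z]{2}[CDE] that ignores any lookaround."""
    letters = "".join(rng.choice(string.ascii_uppercase) for _ in range(2))
    return letters + rng.choice("CDE")


def generate_verified(pattern, naive, rng, max_attempts=10_000):
    """Generate with `naive`, keep only candidates the full pattern accepts."""
    compiled = re.compile(pattern)
    for _ in range(max_attempts):
        candidate = naive(rng)
        # The real regex engine enforces the lookaround during verification.
        if compiled.fullmatch(candidate) is not None:
            return candidate
    raise RuntimeError("no candidate matched the pattern within max_attempts")


rng = random.Random(0)
print(generate_verified(r"(?!X+)[A-Z]{2}[CDE]", naive_generate, rng))
```

The cost depends on how often the naive output is rejected: for this pattern only candidates starting with X fail, so retries are rare, but a very restrictive lookaround could make the loop slow.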
There is a general issue with lookaround patterns:
whenever a lookaround part should influence another part of the pattern (that is, the values that the other part can produce), generation does not work correctly.
For example:
In this pattern the lookahead part (?!B) influences the [AB] part by limiting the number of valid values of [AB]. This should be supported.
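As a small illustration of that restriction, the Python sketch below filters the candidate values of [AB] through a regex engine. The full pattern is not shown in the text, so `(?!B)[AB]` is an assumption here, and `allowed_values` is a hypothetical helper.

```python
import re


def allowed_values(pattern, candidates):
    """Return the candidates that the whole pattern, lookahead included, accepts."""
    compiled = re.compile(pattern)
    return [c for c in candidates if compiled.fullmatch(c) is not None]


# Assumed pattern: the lookahead removes "B" from the values [AB] may produce.
print(allowed_values(r"(?!B)[AB]", ["A", "B"]))  # ['A']
```

A generator that ignores the lookahead would emit "B" about half the time, and that output would not match the pattern it was generated from.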