BeRo1985 / flre

FLRE - Fast Light Regular Expressions - A fast light regular expression library
GNU Lesser General Public License v2.1

unicode line separator category matching #71

Closed benibela closed 3 years ago

benibela commented 3 years ago

These should match

 writeln(1 = TFLRE.Create('^(?:\p{Z}*)$', []).Find('  



')); //U+1680, U+3000, U+2028, U+2028, U+2029, U+2029
 writeln(1 = TFLRE.Create('^(?:\p{Zl}*)$', []).Find('
')); //U+2028

This one should not

 writeln(0 = TFLRE.Create('^(?:\P{Zl}*)$', []).Find('
')); //U+2028 with the inverted class
benibela commented 3 years ago

I forgot the [rfUTF8] flag

This works:

writeln(1 = TFLRE.Create('^(?:\p{Z}*)$', [rfUTF8]).Find('  



')); //U+1680, U+3000, U+2028, U+2028, U+2029, U+2029
writeln(1 = TFLRE.Create('^(?:\p{Zl}*)$', [rfUTF8]).Find('
')); //U+2028
writeln(0 = TFLRE.Create('^(?:\P{Zl}*)$', [rfUTF8]).Find('
')); //U+2028 with the inverted class

You can delete the issue
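
For anyone who lands here with the same symptom, here is a minimal sketch of why the flag probably matters. It does not use FLRE at all, and it assumes that without `rfUTF8` the input is processed byte by byte instead of being decoded as UTF-8. U+2028 takes three bytes in UTF-8, so a per-byte match never sees a single `Zl` character. The program name and the constant name are made up for illustration.

```pascal
program Utf8LineSeparator;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // UTF-8 encoding of U+2028 (LINE SEPARATOR), written as explicit bytes
  // so the example does not depend on the source file's code page.
  LineSeparatorUtf8: array[0..2] of Byte = ($E2, $80, $A8);

var
  i: Integer;
  CodePoint: LongWord;
begin
  // A byte-oriented matcher sees three separate units, none of which is
  // a line separator on its own.
  WriteLn('Bytes in the UTF-8 form of U+2028: ', Length(LineSeparatorUtf8));
  for i := Low(LineSeparatorUtf8) to High(LineSeparatorUtf8) do
    WriteLn('  byte ', i, ': $', IntToHex(LineSeparatorUtf8[i], 2));

  // Decoding the three-byte sequence (1110xxxx 10xxxxxx 10xxxxxx)
  // recovers the single code point that \p{Zl} is meant to match.
  CodePoint := ((LineSeparatorUtf8[0] and $0F) shl 12) or
               ((LineSeparatorUtf8[1] and $3F) shl 6) or
               (LineSeparatorUtf8[2] and $3F);
  WriteLn('Decoded code point: U+', IntToHex(CodePoint, 4));
end.
```

The decoding step is the part a UTF-8 aware matcher has to do before it can apply a Unicode category such as `Zl`. Skipping it leaves only the raw bytes to test against.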