Closed gitgabrio closed 1 month ago
Thank you for reporting. Should we add such an example to https://kiegroup.github.io/dmn-feel-handbook/#matches-input-pattern-flags @gitgabrio ?
@jomarko TBH I do not know. As far as I can see, that page is used only to show the syntax of the different functions. Inside the TCK (and also in our unit tests) there are a lot of different examples/cases, and I'm not sure the dmn-feel-handbook is meant for that. @baldimir @yesamer wdyt ?
@gitgabrio I agree.
According to the DMN specs, the matches() and replace() functions should behave according to the XQuery 1.0 specification. Our current implementation doesn't rely on that; it uses the native Java regex implementation for both. Unfortunately, Java regex handling differs from the XQuery specs, and natively implementing the XQuery specs would be a huge and complex effort. For that reason, we agreed to rely on an external library that already implements the XQuery specs in Java. After a quick analysis (https://en.wikipedia.org/wiki/XQuery#Implementations), we chose Saxon-HE as the best fit for this purpose.
The scope of the ticket is to integrate this new external dependency into our code base, so that both matches() and replace() can rely on that implementation to behave correctly.
=== Original description ===
TCK tests revealed some cases where matches() behaves incorrectly.
Java's Pattern does not support this subtraction syntax: [A-Z-[OI]]. The correct Java equivalent is [A-Z&&[^OI]] (see Character classes in docs).
XML Schema Part 2: Datatypes Second Edition makes a distinction between negation (^) and subtraction (-). See "Negative Character Group" and "Character Class Subtraction" in the regex chapter:
A negative character group is a ·positive character group· preceded by the ^ character. For all ·positive character group·s P, ^P is a valid negative character group, and C(^P) contains all XML characters that are not in C(P).
A character class subtraction is a ·character class expression· subtracted from a ·positive character group· or ·negative character group·, using the - character.
A "translation" between the two syntaxes is required.
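To make the idea of a translation concrete, here is a minimal sketch in Java of rewriting the XML Schema subtraction form into the intersection form that java.util.regex.Pattern understands. The class XsdSubtractionSketch and its method toJavaSyntax are hypothetical names introduced only for illustration; they are not part of the code base or of Saxon-HE. The sketch assumes well-formed input and has not been verified for nested subtractions, so it is not a substitute for a full XQuery-compliant regex implementation.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class XsdSubtractionSketch {

    /**
     * Rewrites XSD character class subtraction "[P-[S]]" into the Java
     * intersection form "[P&&[^S]]". Assumes well-formed input; nested
     * subtractions have not been verified.
     */
    static String toJavaSyntax(String xsdPattern) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // current nesting level of '[' ... ']'
        for (int i = 0; i < xsdPattern.length(); i++) {
            char c = xsdPattern.charAt(i);
            boolean hasNext = i + 1 < xsdPattern.length();
            if (c == '\\' && hasNext) {
                // Copy an escaped character verbatim, e.g. "\-" or "\[".
                out.append(c).append(xsdPattern.charAt(++i));
            } else if (c == '[') {
                depth++;
                out.append(c);
            } else if (c == ']') {
                depth = Math.max(0, depth - 1);
                out.append(c);
            } else if (c == '-' && depth > 0 && hasNext
                    && xsdPattern.charAt(i + 1) == '[') {
                // "-[" inside a character class starts a subtraction.
                depth++;
                i++; // consume the '['
                if (i + 1 < xsdPattern.length() && xsdPattern.charAt(i + 1) == '^') {
                    // Subtracting a negated class "-[^S]" equals intersecting with S.
                    i++; // consume the '^'
                    out.append("&&[");
                } else {
                    // Subtracting S equals intersecting with "not S".
                    out.append("&&[^");
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String javaPattern = toJavaSyntax("[A-Z-[OI]]");
        System.out.println(javaPattern);                       // [A-Z&&[^OI]]
        System.out.println(Pattern.matches(javaPattern, "A")); // true
        System.out.println(Pattern.matches(javaPattern, "O")); // false
    }
}
```

A rewrite like this only covers character class subtraction. Other differences between XML Schema / XQuery regular expressions and Java regular expressions would each need their own handling, which is the argument for delegating to a library that already implements the XQuery rules.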