morbac / xmltools

XML Tools plugin for Notepad++
GNU General Public License v3.0

Character class subtraction not recognized in schema regex. #195

Open quodvideo opened 1 year ago

quodvideo commented 1 year ago

Validating an XML document against a schema whose regex uses character class subtraction fails, even though the document matches the regex. After removing the character class subtraction from the regex, the document validates.

The part of the regex in question is [A-Z0-9-[AEIOU]]. The document validates when that is replaced with [A-Z0-9].
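For reference, here is a minimal sketch of what that subtraction is expected to mean under the XML Schema regex rules: any character in A-Z or 0-9 except the vowels A, E, I, O, U. Python's re module has no class-subtraction syntax, so the sketch assumes a negative lookahead is an acceptable equivalent. The names SUBTRACTED, PLAIN and matches are hypothetical and only for illustration. It says nothing about how MSXML parses the pattern.

```python
import re

# XSD pattern:   [A-Z0-9-[AEIOU]]
# Assumed Python equivalent: reject a vowel first, then accept A-Z or 0-9.
SUBTRACTED = r"(?![AEIOU])[A-Z0-9]"

# The pattern after removing the subtraction, as in the workaround.
PLAIN = r"[A-Z0-9]"


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern (XSD patterns are anchored)."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for ch in ["B", "0", "A", "E"]:
        print(ch, matches(SUBTRACTED, ch), matches(PLAIN, ch))
```

With the subtraction, consonants and digits are accepted and vowels are rejected. Without it, vowels are accepted too, so removing the subtraction loosens the constraint rather than preserving it.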

XML Tools Plugin version 3.1.1.13 unicode 64bit

XML engine: MSXML

shadedurza commented 1 year ago

I can reproduce this. Validating a sample ACA IRS manifest file against the XSD schema files provided by the IRS results in:

XML Validation error

ERROR - Line 9, pos 96: '00000000-0000-0000-0000-000000000000:SYS12:BB000::T' violates pattern constraint of '([0-9a-zA-Z]{8}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{12}:SYS12:[A-Z-[AEIOU]]{2}[A-Z0-9-[AEIOU]]{3}::T)'.
The element '{urn:us:gov:treasury:irs:ext:aca:air:ty22}UniqueTransmissionId' with value '00000000-0000-0000-0000-000000000000:SYS12:BB000::T' failed to parse.

Navigating to the relevant XSD document and removing -[AEIOU] from the pattern constraint resolves the issue. I also confirmed that the same file and XSD validate successfully in XMLSpy.
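As a rough cross-check outside the plugin, the sketch below tests the value from the error message against the pattern's intended meaning. It assumes each XSD subtraction such as [A-Z-[AEIOU]] can be expressed in Python as a negative lookahead, since Python's re module does not support the subtraction syntax. UTID_PATTERN and is_valid_utid are hypothetical names for illustration, not part of the IRS schema or of XML Tools.

```python
import re

# Assumed Python equivalents of the two XSD subtraction classes.
NO_VOWEL_UPPER = r"(?:(?![AEIOU])[A-Z])"              # [A-Z-[AEIOU]]
NO_VOWEL_UPPER_OR_DIGIT = r"(?:(?![AEIOU])[A-Z0-9])"  # [A-Z0-9-[AEIOU]]

# The pattern from the error message, with the subtractions translated.
UTID_PATTERN = (
    r"[0-9a-zA-Z]{8}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{4}-[0-9a-zA-Z]{12}"
    r":SYS12:"
    + NO_VOWEL_UPPER + r"{2}"
    + NO_VOWEL_UPPER_OR_DIGIT + r"{3}"
    + r"::T"
)


def is_valid_utid(value: str) -> bool:
    """Whole-string match, mirroring the implicit anchoring of XSD patterns."""
    return re.fullmatch(UTID_PATTERN, value) is not None


if __name__ == "__main__":
    print(is_valid_utid("00000000-0000-0000-0000-000000000000:SYS12:BB000::T"))
```

Under that reading the reported value satisfies the constraint, which is consistent with the file validating in XMLSpy.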

Notepad++ v8.4.9 (64-bit)