Open AikenBM opened 12 months ago
Very excellent bug report, I'll check it out, thanks.
I want to add that XML Notepad's schema validation of characters below 0x20 is not correct, or at least does not fully work as defined here: https://www.w3.org/TR/2006/REC-xml-20060816/#NT-Char
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
I had a file in which the character reference `&#xE;` (SO, Shift Out) was flagged as invalid:
26 23 78 45 3B -> https://en.wikipedia.org/wiki/Shift_Out_and_Shift_In_characters
but the character reference `&#x1F;` (US, Unit Separator) was not flagged as invalid in the same file, even though the Char production excludes it as well:
26 23 78 31 46 3B -> https://en.wikipedia.org/wiki/C0_and_C1_control_codes#Field_separators
I hope this helps with fixing the schema validation.
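For reference, the Char production quoted above can be checked independently of XML Notepad. The following is only a minimal Python sketch, not what XML Notepad does internally; the names `is_xml10_char` and `invalid_char_refs` are made up for this illustration, and it only looks at numeric character references, not at raw bytes in the file.

```python
import re


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Numeric character references: &#xHEX; or &#DEC;
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def invalid_char_refs(text: str):
    """Yield (line, column, reference) for each reference outside Char.

    Line and column are 1-based; lines are split on "\n" only.
    """
    for lineno, line in enumerate(text.split("\n"), start=1):
        for m in _CHAR_REF.finditer(line):
            hex_digits, dec_digits = m.group(1), m.group(2)
            cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if not is_xml10_char(cp):
                yield lineno, m.start() + 1, m.group(0)
```

With this check, both `&#xE;` and `&#x1F;` are reported, while `&#x9;` is accepted, which is the behaviour I would expect from the validator.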
I'm using XML Notepad 2.9.0.5 on Windows 10 Enterprise 22H2 19045.3208.
I've discovered that XML Notepad's validation gives up around the 20 MB mark. The last error the application reports is shown with a line number and a column number of `0` in the error list, and the program then stops reporting any further schema errors.
I am attaching a zipped 42 MB XML file that contains 100 schema validation errors. This example uses sample data from the state of Michigan's Department of Education state reporting system because that's what I was working with when I found the problem.
SchemaErrors.zip
XML Notepad validates and identifies the first 43 or so errors, but the last one listed doesn't appear to populate the table the same way, and it stops after that error. Even exporting the list doesn't show the remaining errors. Here's a screenshot of the error list:
Based on my incidental testing, any schema validation error after roughly the 20 millionth character or the 565,000th line in the file will fail in this way. Only the first error in that range shows up in the error list, and the list does not display accurate information for it. I don't see any setting in the application's options to raise this apparent limit.
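For anyone who wants to probe the cutoff without the attached data, a synthetic file with an invalid value at a chosen line may be enough. This is only a sketch under assumptions: the element names (`rows`, `row`, `n`), the schema, and the function `write_repro` are all hypothetical and unrelated to the Michigan sample, and it assumes the schema gets picked up through the `xsi:noNamespaceSchemaLocation` attribute on the root element.

```python
import os

# Hypothetical minimal schema: a <rows> root holding <row><n>int</n></row> items.
XSD = """<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="rows">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="row" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="n" type="xs:int"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def write_repro(xml_path, xsd_path, total_rows, bad_rows):
    """Write one <row> per line; rows listed in bad_rows get a non-integer <n>.

    Row i (1-based) lands on line i + 2 of the XML file.
    """
    bad = set(bad_rows)
    with open(xsd_path, "w", encoding="utf-8", newline="\n") as f:
        f.write(XSD)
    with open(xml_path, "w", encoding="utf-8", newline="\n") as f:
        f.write('<?xml version="1.0" encoding="utf-8"?>\n')  # line 1
        f.write(
            '<rows xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
            f'xsi:noNamespaceSchemaLocation="{os.path.basename(xsd_path)}">\n'
        )  # line 2
        for i in range(1, total_rows + 1):
            value = "not-a-number" if i in bad else str(i)
            f.write(f"<row><n>{value}</n></row>\n")
        f.write("</rows>\n")
```

For example, `write_repro("repro.xml", "repro.xsd", 1_000_000, [100, 600_000])` would place one error well before and one well after the line range where validation stopped for me.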
I've also included a PowerShell function that uses `System.Xml.XmlReader` to validate the file against the schema, and it correctly identifies all 100 schema errors.
Note that I typically use PowerShell v7.3 with that function. I'm not sure whether it also works with Windows PowerShell v5.1.