itplr-kosit / validator

Validates XML documents with XML Schema and Schematron
Apache License 2.0

Accent support #92

Closed. RedFish closed this issue 2 years ago

RedFish commented 2 years ago

Hello,

First, thank you for this repo and your work, it saves me a lot of time ❤️

I'm trying to validate a Peppol XML file with this validator. My file contains accented characters: <cbc:RegistrationName>Hélène </cbc:RegistrationName>

The "è" character raises the following error: IOException while reading resource supplied_instance_42: org.xml.sax.SAXParseException; systemId: file:///some/path/supplied_instance_42; lineNumber: 263; columnNumber: 10; XML document structures must start and end in the same entity.

Full error message: { "rep:validationStepResult": { "rep:message": { "#text": "IOException while reading resource supplied_instance_15: org.xml.sax.SAXParseException; systemId: file:///Users/richard/Documents/Projets/escentiel/supplied_instance_15; lineNumber: 258; columnNumber: 10; Les structures de document XML doivent commencer et se terminer dans la même entité.", "#_id": "val-xml.1", "#_level": "error", "#_code": "generic-error" }, "#_id": "val-xml", "#_valid": "false" } }

PS: I'm using this validator from JavaScript, which is why the response above is in JSON format.

Is there a solution or workaround for this bug?

apenski commented 2 years ago

Hi @RedFish

Glad to hear that the validator is useful in a Peppol context too 😄

Regarding your accent problem, I think there must be something wrong with your input file. I tried simple accents by pasting your Hélène into the simple.xml used by the simple happy-path test case, and the behaviour did not change: the test is still green.

So a simple accent isn't the problem. Maybe it is an encoding problem? Can you create a simple test case like the one above to reproduce it? Do you have a stack trace to help locate the problem?

RedFish commented 2 years ago

My bad, I was sending the file with a wrong "Content-Length" header... Sorry for the inconvenience
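
For anyone landing here with the same symptom: the thread does not say how the wrong "Content-Length" value was computed, but one common way to get it wrong in JavaScript is to use the string's `length` instead of its UTF-8 byte length. Accented characters such as "é" and "è" take two bytes each in UTF-8, so the header comes out too small and the receiver can cut the document off before its closing tag. The sketch below assumes Node.js; the helper names `contentLengthFor` and `simulateTruncatedRead` are hypothetical and only illustrate the effect, they are not part of the validator.

```javascript
// Sketch under assumptions: Node.js runtime, UTF-8 request body.
// Only the RegistrationName element is taken from the issue above;
// \u00e9 is "é" and \u00e8 is "è".
const xml = '<cbc:RegistrationName>H\u00e9l\u00e8ne </cbc:RegistrationName>';

// Hypothetical helper: the value Content-Length should carry is the
// number of bytes in the encoded body, not the number of characters.
function contentLengthFor(body) {
  return Buffer.byteLength(body, 'utf8');
}

// Hypothetical helper: mimic a receiver that stops reading after
// `declaredLength` bytes, as it would with a too-small Content-Length.
function simulateTruncatedRead(body, declaredLength) {
  return Buffer.from(body, 'utf8').subarray(0, declaredLength).toString('utf8');
}

const wrongLength = xml.length;            // counts UTF-16 code units
const rightLength = contentLengthFor(xml); // counts UTF-8 bytes

console.log('string length:', wrongLength);
console.log('byte length:  ', rightLength);
console.log('body seen with wrong length:', simulateTruncatedRead(xml, wrongLength));
console.log('body seen with right length:', simulateTruncatedRead(xml, rightLength));
```

With the string length, the simulated read loses the tail of the document, which matches the kind of "must start and end in the same entity" parse error reported above. Passing a `Buffer` as the request body, or setting the header from `Buffer.byteLength`, avoids the mismatch.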