We are ingesting many XML files that are classified by JHOVE as "not well-formed" although they are well-formed. Here is an example:
12745764.zip
These XML files were created by ABBYY FineReader. They contain an http link to a schema. If "http" is changed to "https", JHOVE reports the file as well-formed. Since no XML version is declared at the top of the file, it is treated as XML 1.0, and XML 1.0 does not require a schema. If the schema location were wrong, the file might be invalid, but it would still be well-formed.
JhoveView (Rel. 1.28.0, 2023-05-18)
Date: 2024-03-25 18:58:12 MEZ
RepresentationInformation: C:\Users\rsuri\Downloads\12745764.xml
ReportingModule: XML-hul, Rel. 1.5.3 (2023-03-16)
LastModified: 2024-03-25 18:40:00 MEZ
Size: 829103
Format: XML
Status: Not well-formed
SignatureMatches:
XML-hul
ErrorMessage: SAXParseException: Premature end of file. Line = -1, Column = -1.
ID: XML-HUL-1
MIMEtype: text/xml
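
To illustrate the point above that well-formedness does not depend on the schema, here is a minimal sketch in Python (not JHOVE's own code) that checks well-formedness with a non-validating SAX parser and never retrieves the schema. The function name `is_well_formed`, the sample document, and the `http://example.com/...` URLs are hypothetical placeholders, not taken from the attached file.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


def is_well_formed(data: bytes) -> bool:
    """Return True if data parses as well-formed XML; no schema is fetched."""
    parser = xml.sax.make_parser()
    # Do not load external general entities, so no network access happens.
    parser.setFeature(feature_external_ges, False)
    # A no-op handler: we only care whether parsing succeeds.
    parser.setContentHandler(ContentHandler())
    try:
        parser.parse(io.BytesIO(data))
        return True
    except xml.sax.SAXParseException:
        return False


# Hypothetical sample: no XML declaration, schema referenced over plain http.
sample = (
    b'<document xmlns="http://example.com/ns" '
    b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    b'xsi:schemaLocation="http://example.com/ns http://example.com/schema.xsd">'
    b"<page/></document>"
)

if __name__ == "__main__":
    print(is_well_formed(sample))
    print(is_well_formed(b"<a><b></a>"))
```

In this sketch the `xsi:schemaLocation` attribute is treated as an ordinary attribute, so the result is the same whether the schema URL uses http or https, or is unreachable.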