In some of our XHTML 1.0 Transitional files the XML declaration is missing. As a result, JHOVE reports HTML-HUL-16 ("Unrecognized or missing DOCTYPE declaration; validation continuing as HTML 3.2"). If I manually add the XML declaration, the document is processed as XHTML (by the XML module) and JHOVE correctly finds, for example, an unclosed tag somewhere in the document.
According to the XHTML 1.0 specification, "An XML declaration is not required in all XML documents" (https://www.w3.org/TR/xhtml1/normative.html). For XHTML 1.1 the XML declaration is likewise a 'SHOULD', not a 'MUST'. It seems that JHOVE expects an XML declaration to always be present.
Could this please be fixed, so that JHOVE correctly processes XHTML files without an XML declaration?
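To illustrate the behaviour being requested, here is a minimal sketch of detecting XHTML from the DOCTYPE public identifier alone, independent of whether an XML declaration is present. It is not JHOVE's actual detection logic (JHOVE is written in Java); the function names `has_xml_declaration` and `is_xhtml` are hypothetical and exist only for this example.

```python
import re

# Matches the public identifier of a DOCTYPE such as
# <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "...">
DOCTYPE_PUBLIC_ID = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE
)

# An XML declaration, if present, must be the very first thing in the file
# (a leading byte order mark is tolerated here).
XML_DECLARATION = re.compile(r'<\?xml\s')


def has_xml_declaration(text: str) -> bool:
    """Return True if the document starts with an XML declaration."""
    return XML_DECLARATION.match(text.lstrip("\ufeff")) is not None


def is_xhtml(text: str) -> bool:
    """Return True if the DOCTYPE public identifier names an XHTML DTD.

    Assumption for this sketch: the XML declaration plays no role
    in the decision, since the XHTML specification makes it optional.
    """
    match = DOCTYPE_PUBLIC_ID.search(text)
    return match is not None and "//DTD XHTML" in match.group(1)


XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
HTML4_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
)

for sample in (XHTML_DOCTYPE, HTML4_DOCTYPE):
    print(has_xml_declaration(sample), is_xhtml(sample))
```

With this approach, a file that starts directly with the XHTML 1.0 Transitional DOCTYPE would still be classified as XHTML and could be handed to an XML-aware validator.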
Example of problem (edit to see all markup):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
JHOVE 1.26.1 output (Dutch):
Documents
C:\Temp\example.htm
Module
HTML-hul
Release: 1.4.2
Date: 22-apr-2022
RepInfo
URI: C:\Temp\example.htm
LastModified: Mon Mar 04 15:45:26 CET 2024
Size: 534
Format: HTML
Status: Not well-formed
Messages
ErrorMessage: Onherkend of ontbrekende DOCTYPE declaratie; validatie wordt verder gezet als HTML 3.2
ID: HTML-HUL-16
InfoMessage: This HTML version is currently not supported, falling back to HTML 3.2
ID: NO-ID
ErrorMessage: Ongedefinieerd attribuut voor element
ID: HTML-HUL-7
SubMessage: Name = html, Attribute = xmlns, Line = 2, Column = 7
ErrorMessage: De constructie met "/>" is onjuist, behalve in XHTML
ID: NO-ID
SubMessage: Name = meta, Line = 4, Column = 10
ErrorMessage: De constructie met "/>" is onjuist, behalve in XHTML
ID: NO-ID
SubMessage: Name = link, Line = 6, Column = 10
MimeType: text/html
Example with XML declaration added:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
JHOVE 1.26.1 output (Dutch):
Documents
C:\Temp\example_with_XML_declaration.htm
Module
XML-hul
Release: 1.5.2
Date: 22-apr-2022
RepInfo
URI: C:\Temp\example_with_XML_declaration.htm
LastModified: Mon Mar 04 15:48:22 CET 2024
Size: 574
Format: XML
Status: Not well-formed
SignatureMatches
XML-hul
Messages
ErrorMessage: SAXParseException
ID: XML-HUL-1
SubMessage: The element type "link" must be terminated by the matching end-tag "</link>". Line = 8, Column = 7.
MimeType: text/xml
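For comparison, the sketch below shows that a generic XML parser reports an unclosed `link` element whether or not the XML declaration is present. It uses Python's standard `xml.etree.ElementTree` (expat-based) as a stand-in and is not JHOVE's XML module. The sample markup is hypothetical, since only the DOCTYPE of the original file is shown above, and the helper name `well_formedness_error` is made up for this example.

```python
import xml.etree.ElementTree as ET

XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'

DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)

# Hypothetical sample document: the link element is left unclosed on purpose.
BROKEN = DOCTYPE + """
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example</title>
<link rel="stylesheet" href="style.css">
</head>
<body><p>Hello</p></body>
</html>
"""

# Same document with the link element closed in the XHTML way.
FIXED = BROKEN.replace('href="style.css">', 'href="style.css" />')


def well_formedness_error(text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        # Parse bytes so an encoding named in the declaration is honoured.
        ET.fromstring(text.encode("utf-8"))
    except ET.ParseError as exc:
        return str(exc)
    return None


for label, doc in [
    ("broken, no declaration", BROKEN),
    ("broken, with declaration", XML_DECLARATION + BROKEN),
    ("fixed, no declaration", FIXED),
    ("fixed, with declaration", XML_DECLARATION + FIXED),
]:
    print(label, "->", well_formedness_error(doc))
```

In this sketch, both broken variants yield an error and both fixed variants parse cleanly, which is the behaviour the issue asks JHOVE to provide for XHTML files without an XML declaration.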