Closed djbpitt closed 3 years ago
Declare HTML namespace on root element and add:
declare option exist:serialize "method=xhtml5 indent=yes html-version=5.0 media-type=application/xhtml+xml";
Not documented officially, but see https://markmail.org/message/4ubfxqyeq2rp3tdw for discussion.
(Using the 2021-10-22 nightly build of 5.4)
If I have understood the correspondence at https://markmail.org/message/4ubfxqyeq2rp3tdw correctly, the only way to ask eXist-db to serve 1) XHTML5 with XML syntax, 2) a doctype declaration that looks like <!DOCTYPE html>, 3) the application/xhtml+xml mime type, 4) an XML declaration, and 5) the HTML namespace is to declare the namespace on the root <html> element (this much is expected) and use a legacy declaration:
declare option exist:serialize "method=xhtml5 indent=yes html-version=5.0 media-type=application/xhtml+xml omit-xml-declaration=no";
I do not need to save the result into the database; I just need to access the XQuery and return the result to a browser.
As far as I’ve been able to tell from the available documentation, it is not possible to return results like this with the non-legacy method of declaring options, although the correspondence I cite above seems to suggest that it should be possible by specifying the method as xhtml (not xhtml5) together with an html-version of 5.0 and no public or system doctype. Perhaps more importantly, I cannot find any documentation (I did a full-text search for xhtml5 at http://exist-db.org/exist/apps/doc/search.html?q=xhtml5) for how to obtain this result using the legacy method.
I think expected behavior is to be able to get output according to the five features listed above using the non-legacy method, which I guess might count as a feature request. But if I am correct in thinking that the legacy method is available and not scheduled for removal, but also not documented, that would seem to be a documentation error. Is there something constructive that I can do (I’m not a Java programmer) to help clear up the confusion? Or is this my misunderstanding, rather than a real issue?
8 replies
line0 1 day ago
I am sorry @David, can you cite something that xhtml5 is a thing?
David 1 day ago
@line0 I don’t think xhtml5 is a thing. What I want is HTML5 (which is a thing) with XML syntax (also a thing) and the other features I described (doctype declaration, mime type, XML declaration; all things), but the only way I’ve been able to find to get that combination of features is to ask for a method called xhtml5. As far as I can tell, there isn’t supposed to be such a method and I should be able to get the combination of features I describe by specifying xhtml (which is a documented method) and an html-version of 5.0. But that doesn’t seem to work. Have I misunderstood?
line0 9 hours ago
@David I just talked to @Joern Turner and we both agreed that while you can have well-formed html5 you are likely not allowed to add an XML-declaration at the beginning. And application/xhtml+xml is also not compliant (actually never was interpreted correctly by any client I know of).
Tom Hillman 8 hours ago
Sorry to contradict, @line0, but you can certainly have both the XML declaration and the HTML5 doctype.
Tom Hillman 8 hours ago
Although I notice that the W3C validator erroneously identifies these as 'XML processing instructions' e.g. https://validator.w3.org/nu/?doc=https%3A%2F%2Fyamahito.github.io%2FSyrinscall%2Fdarkly.html
Tom Hillman 8 hours ago
I also note that the validator in question complains about some features necessary for XML compatibility such as "stray end tags"
David 7 hours ago
@line0 @Joern Turner Thank you; this is very helpful information. If you don’t mind, I’d be grateful if we could please explore some of the features that I mentioned in more detail:
“application/xhtml+xml is also not compliant (actually never was interpreted correctly by any client I know of).” Meanwhile, according to https://html.spec.whatwg.org/#html-vs-xhtml, “The second concrete syntax is XML. When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is treated as an XML document by web browsers, to be parsed by an XML processor.” If I am correct in thinking that the spec that I am quoting is authoritative, I don’t know how to understand what you mean when you write that it is “not compliant”. I also don’t understand what you mean when you write that this mime type was never interpreted correctly by any client that you know of. When I use the serialization declaration that I mention at the beginning of this thread and examine the mime type of what eXist-db serves to my installation of Chrome, having opened the network view in the Chrome debugging tools, the mime type identified by the browser matches the type I declare in the XQuery. It is also among the mime types that the browser lists in the Accept request header. I think this combination should mean both that the mime type is conformant with the spec and that at least the current version of Chrome (under macOS) interprets it correctly.
“[...] DOCTYPE if desired, but this is not required to conform to this specification.” I think this tells me that the doctype declaration is not incorrect. According to https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode, serving HTML5 with XML syntax does not require a doctype declaration as long as the mime type is application/xhtml+xml. The MDN page is not a spec, to be sure, but I don’t see anything anywhere in the spec or elsewhere that suggests that using the doctype declaration with XML syntax is incorrect or deprecated.
My original question was “how do I do X?”, which tacitly presupposes that it is reasonable for me to want to do X, so a response that “X is incorrect and you shouldn’t want to do it” is constructive and helpful. It is for that reason that I am now trying to unpack my original question (and its underlying, tacit assumptions) into its constituent pieces, so that if I’ve misunderstood either what I should want to do or how to do it, we’ll be able to identify in a more granular way where I’ve gone astray:
The method serialization value of xhtml5. I did not mean to suggest that xhtml5 is a valid value for the serialization method; I found it mentioned in the correspondence (now several years old) between Martin and Wolfgang that I cite above, which is why I thought it would work (and it does), but one point of my question was whether there was an alternative way to get the serialization I think I want, an alternative that does not use an apparently non-standard method value.
Since the xhtml5 method value does not appear to be conformant, I would not expect it to work with the newer serialization syntax. But if I am correct in thinking that the serialization that I want is legitimate, I would expect it to be supported with a valid method value using the newer serialization syntax, and not just the legacy syntax.
David 5 hours ago
One more reference: the only method values listed in https://www.w3.org/TR/xslt-xquery-serialization-31/#xml-output seem to be xml, xhtml, html, text, json, and adaptive. That matches the methods supported by <xsl:output> in Saxon (https://www.saxonica.com/documentation11/index.html#!xsl-elements/output).
The following serialization strategy, combined with declaring the HTML namespace on the root <html> element, meets all of our requirements:
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "xhtml";
declare option output:media-type "application/xhtml+xml";
declare option output:omit-xml-declaration "no";
declare option output:html-version "5.0";
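For context, here is a minimal sketch of a complete main module that combines these declarations with a namespaced root element. The page content (title and paragraph) is placeholder material invented for illustration, and the exact bytes eXist-db returns may differ between versions, so treat this as a starting point rather than a verified recipe.

```xquery
xquery version "3.1";

(: Standard W3C serialization namespace, as used in the declarations above :)
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method "xhtml";
declare option output:media-type "application/xhtml+xml";
declare option output:omit-xml-declaration "no";
declare option output:html-version "5.0";

(: The HTML namespace is declared on the root element.
   The title and paragraph are hypothetical placeholder content. :)
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Sample page</title>
    </head>
    <body>
        <p>First line<br/>second line</p>
    </body>
</html>
```

Requesting the query from a browser and checking the response headers in the network view, as described earlier in the thread, is one way to confirm that the media type and declarations arrive as intended.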
Note that the method must be xhtml, and not html. With html, empty elements are created as unmatched start-tags (e.g., <br>), instead of self-closing empty tags (e.g., <br />). Specifying a method of xhtml ensures XML-compliant representation of empty elements.
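To see the difference between the two methods without going through a browser, one option is to serialize the same fragment both ways with the standard fn:serialize function and a parameter map. This is a sketch that assumes the processor accepts the XQuery 3.1 map form of serialization parameters; the fragment is made up for illustration, and the behavior under eXist-db should be checked against the version in use.

```xquery
xquery version "3.1";

(: Hypothetical fragment containing an empty element in the HTML namespace :)
let $fragment :=
    <p xmlns="http://www.w3.org/1999/xhtml">First line<br/>second line</p>

(: Serialize the same fragment once per method so the results can be compared :)
for $method in ("html", "xhtml")
return
    $method || ": " || serialize($fragment, map {
        "method": $method,
        "html-version": 5.0,
        "omit-xml-declaration": true()
    })
```

The XML declaration is omitted here only to keep the two result strings short; it does not affect how empty elements are written.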
Create XHTML output from eXist-db with XML declaration, doctype declaration, namespace, and media-type