Closed GoogleCodeExporter closed 8 years ago
Okay. Here are the relevant constraints on the data model. We can regard this as
"direct construction", perhaps, or if it's hitting the builder from a parser,
then it's from-infoset (and validation is presumably from-psvi). Regardless,
the constraints always hold.
XQuery Data Model, 6.2 Element Nodes, 6.2.1 Overview, ordered list item #12:
For every expanded QName that appears in the dm:node-name of the element, the
dm:node-name of any Attribute Node among the attributes of the element, or in
any value of type xs:QName or xs:NOTATION (or any type derived from those
types) that appears in the typed-value of the element or the typed-value of any
of its attributes, if the expanded QName has a non-empty URI, then there must
be a prefix binding for this URI among the namespaces of this Element Node.
If any of the expanded QNames has an empty URI, then there must not be any
binding among the namespaces of this Element Node which binds the empty prefix
to a URI.
endquote
From this, we can confidently state: an unbound namespace must either cause an
exception to be thrown, or must trigger some form of "namespace fixup".
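To make the quoted constraint concrete, here is a rough sketch of the check for a single element. The Name and NamespaceConstraint classes are made up for illustration and are not part of any existing API. The check applies the two quoted rules literally to whatever names the caller passes in (element name, attribute names, QName values in content).

```java
import java.util.List;
import java.util.Map;

// Hypothetical holder for an expanded QName; not an existing class.
final class Name {
    final String uri;    // "" means no namespace
    final String local;

    Name(String uri, String local) {
        this.uri = uri;
        this.local = local;
    }
}

final class NamespaceConstraint {
    /**
     * Applies the two quoted rules to the given names against one element's
     * namespace bindings (prefix -> uri, with "" as the empty prefix).
     * Returns null when both rules hold, else a description of the first violation.
     */
    static String firstViolation(List<Name> names, Map<String, String> bindings) {
        for (Name n : names) {
            if (!n.uri.isEmpty()) {
                // Rule 1: a non-empty URI needs some prefix bound to it.
                if (!bindings.containsValue(n.uri)) {
                    return "no prefix bound to " + n.uri + " (needed by " + n.local + ")";
                }
            } else {
                // Rule 2: an empty URI forbids binding the empty prefix to a URI.
                String defaultUri = bindings.get("");
                if (defaultUri != null && !defaultUri.isEmpty()) {
                    return "empty prefix bound to " + defaultUri
                            + " but " + n.local + " has no namespace";
                }
            }
        }
        return null;
    }
}
```

A handler that fails on violations would throw when this returns non-null; a handler that does fixup would repair the bindings instead.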
ContentHandler (the base interface in question here; SequenceHandler and the
builders layer on top of this) is a streaming interface. It's actually the
event handler for a sequential messaging interface, a sink for which the source
is undefined (model.stream() and cursor.write() are potentially sources, but so
are parsers and validators, effectively).
It's easy enough to know to throw an IllegalStateException when, for instance,
an attribute, namespace, or non-whitespace text event follows a document event
(although arguably the latter is not an error, for the XDM, since its
'document' node type can represent an XML entity, which might simply be a text
block).
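The easy ordering checks could look something like the sketch below. The class and its method signatures are simplified stand-ins, not the actual ContentHandler methods, and it does not track element depth. The allowTextInDocument flag covers the arguable case of text directly under a document node.

```java
// Hypothetical, simplified handler; method names and signatures are assumptions.
final class OrderCheckingHandler {
    private enum State { START, IN_DOCUMENT, IN_START_TAG, IN_CONTENT }

    private final boolean allowTextInDocument;
    private State state = State.START;

    OrderCheckingHandler(boolean allowTextInDocument) {
        this.allowTextInDocument = allowTextInDocument;
    }

    void startDocument() {
        require(state == State.START, "startDocument");
        state = State.IN_DOCUMENT;
    }

    void startElement(String uri, String local, String prefix) {
        state = State.IN_START_TAG;
    }

    // Namespace and attribute events are only legal directly after startElement.
    void namespace(String prefix, String uri) {
        require(state == State.IN_START_TAG, "namespace");
    }

    void attribute(String uri, String local, String prefix, String value) {
        require(state == State.IN_START_TAG, "attribute");
    }

    void text(String data) {
        if (state == State.IN_DOCUMENT) {
            // trim() is only an approximation of XML whitespace.
            require(allowTextInDocument || data.trim().isEmpty(), "non-whitespace text");
        } else {
            state = State.IN_CONTENT;
        }
    }

    void endElement() {
        state = State.IN_CONTENT;
    }

    private void require(boolean ok, String event) {
        if (!ok) {
            throw new IllegalStateException(event + " event not allowed in state " + state);
        }
    }
}
```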
The complication that arises is that a ContentHandler may or may not be able to
handle fixups or exceptions "out of order". For instance, suppose that an
element event has a node-name with the default prefix bound to the default
namespace uri, but one of the subsequent namespace events contains a binding of
a non-default uri to the default prefix. The handler could reasonably throw the
exception at the namespace event. But, for more complexity, suppose that the
element name uses a non-default prefix bound to a non-default uri, which is
declared in its namespaces property (that is, a subsequent namespace event),
but then there is an attribute (attribute event) which supplies the same prefix
hint but a different uri. For qnames in content, the problem is potentially
even worse, of course.
First consequence: it appears that we may need a method on ContentHandler which
responds to the question: how is namespace fixup handled? Choices are throwing
an exception versus fixup, but there's another option, of "expecting
well-formedness"--that is, not even checking. So there are two axes: checking
versus not checking, and fixup versus fail. Do we need to do something like
this? If you *know* that your
handler is going to receive well-formed XML (because there's something like an
XML parser making sure that that is true), then a lot of complexity (and
potentially expensive state-keeping) can be avoided. If not, then perhaps you
*ought* to check.
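As a sketch of what such a method might look like: every name here (NamespacePolicy, PolicyAwareHandler, getNamespacePolicy, Sources.needsExtraCheck) is invented for illustration and nothing like it exists on ContentHandler today.

```java
// Hypothetical names throughout; nothing here is an existing API.
enum NamespacePolicy {
    UNCHECKED(false, false), // expect well-formed input, do not check
    FAIL(true, false),       // check, and throw on a violation
    FIXUP(true, true);       // check, and repair a violation where possible

    private final boolean checks;
    private final boolean repairs;

    NamespacePolicy(boolean checks, boolean repairs) {
        this.checks = checks;
        this.repairs = repairs;
    }

    boolean checks() { return checks; }

    boolean repairs() { return repairs; }
}

// The question a source could put to a handler before sending events.
interface PolicyAwareHandler {
    NamespacePolicy getNamespacePolicy();
}

final class Sources {
    // An untrusted source feeding a non-checking handler needs a check in between.
    static boolean needsExtraCheck(PolicyAwareHandler handler, boolean sourceIsTrusted) {
        return !sourceIsTrusted && !handler.getNamespacePolicy().checks();
    }
}
```

A parser-backed source would pass sourceIsTrusted as true and skip the state-keeping entirely.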
How you handle checking is also an interesting question. For instance, we
*could* provide a "NamespaceFixupHandler" in bridgekit, which would be
instantiated with a non-checking downstream Handler. This would mean that the
code could be written generically, and each bridge that used it could rely upon
it to deliver correct input to the FragmentBuilder. Or ProcessingContext's
newFragmentBuilder could accept a parameter, ContentHandler filter. Oh, hmmmm.
Brainstorming there, but that seems rather powerful.
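A very rough sketch of the filter idea follows. The Sink interface is a made-up stand-in for a non-checking downstream handler; its signatures are assumptions, not the real ContentHandler ones, and prefix hints are assumed non-null. The filter buffers each start tag, treats namespace events as authoritative, and re-prefixes any name whose hint conflicts. It only looks at the element's own declarations (inherited bindings and QNames in content are ignored), and it fails rather than repairs the one case that re-prefixing cannot fix.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal event sink; NOT the real ContentHandler signatures.
interface Sink {
    void startElement(String uri, String local, String prefixHint);
    void namespace(String prefix, String uri);
    void attribute(String uri, String local, String prefixHint, String value);
    void text(String data);
    void endElement();
}

// Buffers each start tag, then forwards it with consistent prefixes.
final class NamespaceFixupFilter implements Sink {
    private final Sink down;
    private String[] element;                                           // {uri, local, prefixHint}
    private final Map<String, String> declared = new LinkedHashMap<>(); // prefix -> uri
    private final List<String[]> attributes = new ArrayList<>();        // {uri, local, prefixHint, value}
    private int counter = 0;

    NamespaceFixupFilter(Sink down) { this.down = down; }

    @Override public void startElement(String uri, String local, String prefixHint) {
        flush();
        element = new String[] { uri, local, prefixHint };
    }

    @Override public void namespace(String prefix, String uri) {
        requireOpenTag("namespace");
        declared.put(prefix, uri);
    }

    @Override public void attribute(String uri, String local, String prefixHint, String value) {
        requireOpenTag("attribute");
        attributes.add(new String[] { uri, local, prefixHint, value });
    }

    @Override public void text(String data) { flush(); down.text(data); }

    @Override public void endElement() { flush(); down.endElement(); }

    private void requireOpenTag(String event) {
        if (element == null) throw new IllegalStateException(event + " event outside a start tag");
    }

    // Emits the buffered start tag: declarations are kept, conflicting hints are replaced.
    private void flush() {
        if (element == null) return;
        String defaultUri = declared.get("");
        if (element[0].isEmpty() && defaultUri != null && !defaultUri.isEmpty()) {
            // Re-prefixing cannot help a no-namespace element, so this case fails.
            throw new IllegalStateException("default prefix bound to " + defaultUri
                    + " on no-namespace element " + element[1]);
        }
        Map<String, String> bindings = new LinkedHashMap<>(declared);
        String elementPrefix = resolve(bindings, element[2], element[0], false);
        List<String> attributePrefixes = new ArrayList<>();
        for (String[] a : attributes) attributePrefixes.add(resolve(bindings, a[2], a[0], true));

        down.startElement(element[0], element[1], elementPrefix);
        for (Map.Entry<String, String> b : bindings.entrySet()) down.namespace(b.getKey(), b.getValue());
        for (int i = 0; i < attributes.size(); i++) {
            String[] a = attributes.get(i);
            down.attribute(a[0], a[1], attributePrefixes.get(i), a[3]);
        }
        element = null;
        declared.clear();
        attributes.clear();
    }

    // Binds uri under the hint when possible, otherwise under a generated prefix.
    private String resolve(Map<String, String> bindings, String hint, String uri, boolean isAttribute) {
        if (uri.isEmpty()) return "";                         // no-namespace names are unprefixed
        String prefix = hint;
        String bound = bindings.get(prefix);
        boolean unusable = (isAttribute && prefix.isEmpty())  // default namespace never applies to attributes
                || (bound != null && !bound.equals(uri));     // hint already bound to another uri
        if (unusable) {
            do { prefix = "ns" + counter++; } while (bindings.containsKey(prefix));
        }
        bindings.put(prefix, uri);
        return prefix;
    }
}
```

With this, the awkward case above (element p bound to one uri, then an attribute with hint p but a different uri) reaches the downstream handler with the attribute moved to a generated prefix and a matching namespace event added.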
Original comment by aale...@gmail.com
on 17 Jan 2012 at 6:10
Resolved. Not entirely certain that this is a correct resolution, but the core
requirements seem addressed.
Original comment by aale...@gmail.com
on 7 Feb 2012 at 4:02
Original issue reported on code.google.com by
aale...@gmail.com
on 12 Jan 2012 at 8:48