Enforce XML constraint that attribute names must be unique within an Element start/empty tag.

GoogleCodeExporter commented 9 years ago

To more strictly enforce common XML validation rules, we should enforce the
constraint that attribute names must be unique within an element.

For example, this XML would NOT be allowed:
  <root dupName="value1" dupName="value2" />

Spec:
  http://www.w3.org/TR/2008/PER-xml-20080205/#sec-starttags
    "Well-formedness constraint: Unique Att Spec
    "An attribute name MUST NOT appear more than once in the same start-tag
    or empty-element tag."

JUnit:
    @Test
    public void testBadAttribute_DuplicateName() {
        StringWriter sw = new StringWriter();
        WAX wax = new WAX(sw);
        wax.start("root");
        wax.attr("name", "value1");
        try {
            wax.attr("name", "value2");
            fail("Expected IllegalArgumentException.");
        } catch (IllegalArgumentException expectedIllegalArgumentException) {
            assertEquals("The attribute \"name\" is defined twice in this element.",
                    expectedIllegalArgumentException.getMessage());
        }
    }

Implementation Ideas:
 - Knowing me,  ;->  I would probably put this functionality into the
ElementMetadata class.
 - And I'd probably take this opportunity to move the 'pendingPrefixes'
field into the ElementMetadata class too.  I'd also move the 'verifyPrefixes()'
method to ElementMetadata (but probably not the 'buildQName' method).
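
A minimal sketch of the check the JUnit test above expects, assuming the names are tracked per element in a Set. The class name ElementMetadataSketch and the method defineAttribute are hypothetical and chosen only for illustration; the real ElementMetadata class in WAX may look quite different.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not the actual WAX ElementMetadata class. */
class ElementMetadataSketch {
    // Attribute names already written in this element's start tag.
    private final Set<String> attributeNames = new HashSet<>();

    /** Records an attribute name, rejecting a second use within this element. */
    void defineAttribute(String name) {
        // Set.add returns false when the name is already present.
        if (!attributeNames.add(name)) {
            throw new IllegalArgumentException(
                "The attribute \"" + name + "\" is defined twice in this element.");
        }
    }
}
```

Under this assumption, WAX.attr would call defineAttribute on the metadata of the element at the top of the stack before writing the attribute, so the exception is thrown before any invalid output is produced.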

Original issue reported on code.google.com by jeffgr...@charter.net on 8 Oct 2008 at 11:18

GoogleCodeExporter commented 9 years ago

I definitely think we need to fix this.

Original comment by r.mark.v...@gmail.com on 9 Oct 2008 at 2:05

GoogleCodeExporter commented 9 years ago

Done with Revision 122, and this checkin comment:

Issue 46:  Enforce XML constraint that attribute names must be unique within an
Element start/empty tag.

This checkin enforces the XML Namespace attribute uniqueness rule:
http://www.w3.org/TR/2004/REC-xml-names11-20040204/#uniqAttrs

"6.3 Uniqueness of Attributes
"In XML documents conforming to this specification, no tag may contain two
attributes which:
" 1. have identical names, or
" 2. have qualified names with the same local part and with prefixes which
have been bound to namespace names that are identical.
"This constraint is equivalent to requiring that no element have two
attributes with the same expanded name.
"For example, each of the bad start-tags is illegal in the following:

<!-- http://www.w3.org is bound to n1 and n2 -->
<x xmlns:n1="http://www.w3.org" 
   xmlns:n2="http://www.w3.org" >
  <bad n1:a="1"  n2:a="2" />
</x>
"
[One element dropped from their example, as it was covered in the base
non-namespace XML case.]

Memory Footprint:
The Set of XML Namespace prefixes for each Element in the Stack has been
generalized to a Map of prefix names to their URL strings, so that we can do
the correct XML Namespace validation of expanded attribute name uniqueness.
This increases the memory footprint by a small, effectively constant amount:
one String per level, up to the maximum depth of the XML document.
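
To illustrate why the prefix-to-URL Map is needed, here is a rough sketch of an expanded-name uniqueness check. The class ExpandedNameChecker and its methods are hypothetical, not the code from this checkin, and the sketch assumes every prefix is declared before an attribute uses it (the actual code defers prefix verification).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of expanded-name uniqueness; not the actual WAX code. */
class ExpandedNameChecker {
    // Prefix -> namespace URL bindings visible to the current element.
    private final Map<String, String> namespaceUrls = new HashMap<>();
    // Expanded names ("{url}localName") of the attributes seen so far in this tag.
    private final Set<String> expandedNames = new HashSet<>();

    void declareNamespace(String prefix, String url) {
        namespaceUrls.put(prefix, url);
    }

    /** Pass a null prefix for an unprefixed attribute, which is in no namespace. */
    void defineAttribute(String prefix, String localName) {
        String url = "";
        if (prefix != null) {
            url = namespaceUrls.get(prefix);
            if (url == null) {
                throw new IllegalArgumentException(
                    "The namespace prefix \"" + prefix + "\" is not bound.");
            }
        }
        // Two prefixes bound to the same URL yield the same expanded name.
        String expandedName = "{" + url + "}" + localName;
        if (!expandedNames.add(expandedName)) {
            throw new IllegalArgumentException(
                "The attribute \"" + expandedName + "\" is defined twice in this element.");
        }
    }
}
```

With n1 and n2 both bound to http://www.w3.org, defining n1:a and then n2:a fails in this sketch, which matches the "bad" start-tag in the spec example. A plain Set of prefixes could not detect that case, because the two qualified names differ as strings.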

Also refactored 'pendingPrefixes' into ElementMetadata, renaming it
'unverifiedNamespacePrefixes'.

Additional refactoring:  Changed 'elementStack.size() == 0' to
'elementStack.empty()'.  (Left 'elementStack.size() > 0' expressions as-is, as
'!elementStack.empty()' seemed harder to read.)

Original comment by jeffgr...@charter.net on 10 Oct 2008 at 7:48

Changed state: Verified

GoogleCodeExporter commented 9 years ago

Hmmm...
We probably need to handle the case where binding a namespace prefix to an
empty-string URL "undefines" it, making the prefix INVALID in that scope.

See: http://www.w3.org/TR/2004/REC-xml-names11-20040204/#scoping-defaulting

Example:
<?xml version="1.1"?>
<x xmlns:n1="http://www.w3.org">
    <n1:a/>               <!-- legal; the prefix n1 is bound to http://www.w3.org -->
    <x xmlns:n1="">
        <n1:a/>           <!-- illegal; the prefix n1 is not bound here -->
        <x xmlns:n1="http://www.w3.org">
            <n1:a/>       <!-- legal; the prefix n1 is bound again -->
        </x>
    </x>
</x>
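
One possible way to model the scoping in the example above is a stack of per-element prefix maps, where an empty-string binding hides any outer binding. This is only a sketch under that assumption; the class NamespaceScopes and its method names are hypothetical and are not part of WAX.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of scoped prefix lookup; not the actual WAX code. */
class NamespaceScopes {
    // One map per open element; the innermost element is at the head.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void startElement() {
        scopes.push(new HashMap<>());
    }

    void endElement() {
        scopes.pop();
    }

    /** Declares a prefix on the innermost open element (startElement must come first). */
    void declare(String prefix, String url) {
        scopes.peek().put(prefix, url);
    }

    /** Returns the URL bound to the prefix, or null if it is unbound in this scope. */
    String resolve(String prefix) {
        // ArrayDeque iterates from the head, so the innermost scope is checked first.
        for (Map<String, String> scope : scopes) {
            String url = scope.get(prefix);
            if (url != null) {
                // An empty URL "undefines" the prefix and hides outer bindings.
                return url.isEmpty() ? null : url;
            }
        }
        return null;
    }
}
```

Walking the example through this sketch: resolve("n1") returns the URL in the outer x, null inside the x that declares xmlns:n1="", and the URL again inside the innermost x that rebinds it.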

Original comment by jeffgr...@charter.net on 10 Oct 2008 at 9:28

GoogleCodeExporter commented 9 years ago

Added ability to undefine namespaces in revision 129 per section 6.1 of the
"Namespaces in XML" REC:

http://www.w3.org/TR/2004/REC-xml-names11-20040204/#scoping

"The attribute value in a namespace declaration for a prefix MAY be empty.
This has the effect, within the scope of the declaration, of removing any
association of the prefix with a namespace name."

That should completely close out all functionality changes for this enhancement.

Original comment by jeffgr...@charter.net on 13 Oct 2008 at 3:01

mvolkmann / waxy
