Open GoogleCodeExporter opened 9 years ago
I'm debating whether it's worthwhile to do anything about this. If I understand
correctly, we don't need to do the following escapes:
> in attribute or text values (only in CDATA sections)
' in double quoted attributes or any text
" in single quoted attributes or any text
Should WAX avoid all of these escapes?
Original comment by r.mark.v...@gmail.com
on 29 Sep 2008 at 4:14
I don't have a strong opinion about it.
I like the readability of ' and " instead of &apos; and &quot; in the resulting XML.
But I like the (admittedly unnecessary) symmetry of &lt; (required) and &gt; (optional).
It's subjective. I could go either way.
Original comment by jeffgr...@charter.net
on 2 Oct 2008 at 2:09
Does anybody know what is technically correct here? I'd like to do whatever the
XML recommendation says we should do. Is what we are currently doing (escaping
all five special characters all the time) considered wrong?
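One way to check this empirically is to hand unescaped characters to a conforming parser. The sketch below uses the JDK's built-in DOM parser; the class and method names are illustrative only and are not part of WAX. It suggests that plain quotes in text, a plain ' inside a double-quoted attribute, a plain " inside a single-quoted attribute, and a plain > are all accepted. Always escaping all five characters is also permitted by the recommendation, so the current behavior is verbose rather than wrong.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class UnescapedCharsCheck {
    // Hypothetical helper: parses the XML string and returns its root element.
    // Checked parser exceptions are wrapped so callers need no throws clause.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Unescaped ', " and > in places where the XML grammar allows them.
        Element root = parseRoot(
            "<root a=\"O'Riley > x\" b='say \"hi\"'>1 > 0, \"q\", 'q'</root>");
        System.out.println(root.getAttribute("a"));
        System.out.println(root.getAttribute("b"));
        System.out.println(root.getTextContent());
    }
}
```

If the parser rejected any of these, parseRoot would throw instead of returning the element.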
Original comment by r.mark.v...@gmail.com
on 4 Oct 2008 at 4:02
Hey, the issue is marked as "Enhancement" ;-)
I agree with Jeff, i.e. I like plain quotes in text and the symmetry of escaping
both < and >.
Original comment by manosbat...@gmail.com
on 4 Oct 2008 at 4:13
I added a test illustrating the one case where '>' quoting is needed. WAX is
currently doing the correct quoting.
For guidance, I'm looking at the "2.4 Character Data and Markup" section of this document:
http://www.w3.org/TR/2008/PER-xml-20080205/#syntax
I think we need this additional test, as a restriction on valid CDATA content:
(You can't have the sequence "]]>" in CDATA.)
@Test
public void testCDATAContainingCDATASectionCloseDelimiter() {
    StringWriter sw = new StringWriter();
    WAX wax = new WAX(sw);
    wax.start("root");
    try {
        wax.cdata("==]]>==");
        fail("Expected IllegalArgumentException.");
    } catch (final IllegalArgumentException expectedIllegalArgumentException) {
        assertEquals(
            "CDATA section data must not contain the CDATA section close delimiter, ']]>'.",
            expectedIllegalArgumentException.getMessage());
    }
}
Original comment by jeffgr...@charter.net
on 5 Oct 2008 at 3:27
[ Lightbulb pops up over head! ;-> ]
Or we could just make it work, for the users, instead of restricting/preventing.
That is, instead of doing this in WAX...
if (text.indexOf("]]>") > -1)
    throw new IllegalArgumentException(
        "CDATA section data must not contain the CDATA section close delimiter, ']]>'.");
We could do this:
text("<![CDATA[" + text.replaceAll(Pattern.quote("]]>"), "]]]]><![CDATA[>") + "]]>", newLine);
With this test:
@Test
public void testCDATAContainingCDATASectionCloseDelimiter_Supported() throws Exception {
    final StringWriter sw = new StringWriter();
    WAX wax = new WAX(sw);
    wax.start("root").cdata("==]]>==").close();
    final String xmlString = sw.toString();
    assertEquals("<root><![CDATA[==]]]]><![CDATA[>==]]></root>", xmlString);
    final Document doc = parseXml(xmlString);
    doc.normalize();
    final Element rootElement = doc.getDocumentElement();
    assertEquals("root", rootElement.getNodeName());
    assertEquals("==]]>==", rootElement.getTextContent());
}
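For anyone who wants to try the splitting trick outside of WAX, here is a minimal standalone sketch. The class and method names are made up for illustration, and it uses String.replace (literal replacement) in place of replaceAll with Pattern.quote. Each "]]>" in the input is cut after "]]", the current section is closed, and a new section is opened that starts with ">", so no single section ever contains the close delimiter.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class CdataSplitSketch {
    // Wraps text in one or more CDATA sections, splitting at every "]]>".
    static String toCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Hypothetical round-trip helper: parses <root>...</root> with the JDK
    // DOM parser and returns the text content the parser sees.
    static String roundTrip(String text) {
        String xml = "<root>" + toCdata(text) + "</root>";
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Generated XML did not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toCdata("==]]>=="));
        System.out.println(roundTrip("==]]>=="));
    }
}
```

The reader of the XML sees adjacent CDATA sections, which a DOM parser concatenates when asked for the element's text content.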
Original comment by jeffgr...@charter.net
on 5 Oct 2008 at 3:35
Tests for simplified quoting, should we want to implement this feature at some point...
(The first of the three tests passes today; it just illustrates required '"' quoting in attributes.)
@Test
public void testAttributeWithDoubleQuoteCharacter() throws Exception {
    StringWriter sw = new StringWriter();
    WAX wax = new WAX(sw);
    final String attributeValue = "Bill \"The Man\" Bates";
    wax.start("root").attr("a", attributeValue).close();
    // A double quote inside a double-quoted attribute must stay escaped.
    assertEquals("<root a=\"Bill &quot;The Man&quot; Bates\"/>", sw.toString());
    final Document doc = parseXml(sw.toString());
    final Element rootElement = doc.getDocumentElement();
    assertEquals(attributeValue, rootElement.getAttribute("a"));
}
@Test
public void testAttributeWithSingleQuoteCharacter() throws Exception {
    StringWriter sw = new StringWriter();
    WAX wax = new WAX(sw);
    final String attributeValue = "Bill O'Riley";
    wax.start("root").attr("a", attributeValue).close();
    // A single quote inside a double-quoted attribute needs no escaping.
    assertEquals("<root a=\"" + attributeValue + "\"/>", sw.toString());
    // final Document doc = parseXml("<root a=\"Bill O'Riley\"/>");
    final Document doc = parseXml(sw.toString());
    final Element rootElement = doc.getDocumentElement();
    assertEquals(attributeValue, rootElement.getAttribute("a"));
}
@Test
public void testTextWithQuoteCharacters() throws Exception {
    StringWriter sw = new StringWriter();
    WAX wax = new WAX(sw);
    final String unquotedTextValue = "Bill \"The Man\" O'Riley";
    // Write the value as element text (not as an attribute) to match the assertions below.
    wax.start("root").text(unquotedTextValue).close();
    assertEquals("<root>" + unquotedTextValue + "</root>", sw.toString());
    // final Document doc = parseXml("<root>"+unquotedTextValue+"</root>");
    final Document doc = parseXml(sw.toString());
    final Element rootElement = doc.getDocumentElement();
    assertEquals(unquotedTextValue, rootElement.getTextContent());
}
Original comment by jeffgr...@charter.net
on 5 Oct 2008 at 4:18
As best I can tell, the current WAX implementation is correct.
So this issue is correctly categorized as an enhancement, not a bug.
It's a style issue. Would the resulting XML look better with less &-quoting?
I.e., would it be more readable by humans?
I say, don't stress on it for the 1.0 release.
The downside of this enhancement is that we'd need two different quoting
methods: one for text, the other for attributes. But that's the essence of
the enhancement, so I wouldn't really consider it an issue.
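A rough sketch of what the two quoting methods might look like, assuming attributes are always written with double quotes (as in the tests above). The class and method names are hypothetical and this is not the WAX implementation. In text, only & and < are always escaped, and > is escaped only where it would otherwise form "]]>". In attributes, & and < and " are escaped, while ' and > are left alone.

```java
class MinimalEscapeSketch {
    // Escaping for element text: & and < always; > only as part of "]]>".
    static String escapeText(String s) {
        return s.replace("&", "&amp;")   // must run first so later entities survive
                .replace("<", "&lt;")
                .replace("]]>", "]]&gt;");
    }

    // Escaping for a double-quoted attribute value: &, < and ".
    // Tabs and newlines are not handled here; a parser normalizes them to
    // spaces unless they are written as character references.
    static String escapeAttribute(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(escapeText("Bill \"The Man\" O'Riley, 1 > 0 & a ]]> b"));
        System.out.println(escapeAttribute("Bill \"The Man\" O'Riley"));
    }
}
```

Keeping the two methods separate means neither has to know which quote character the other context uses.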
Original comment by jeffgr...@charter.net
on 5 Oct 2008 at 4:21
Original issue reported on code.google.com by
manosbat...@gmail.com
on 25 Sep 2008 at 3:11