sudham / ticpp

Automatically exported from code.google.com/p/ticpp

Whitespace is collapsed by default (violates XML spec) #33

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
According to the XML 1.0 Recommendation, section 2.10
(http://www.w3.org/TR/xml/#sec-white-space), "An XML processor MUST always
pass all characters in a document that are not markup through to the
application."

TinyXML++ follows TinyXML's approach: it collapses whitespace by default
and lets you change that setting with SetCondenseWhiteSpace, because
"the world hasn't agreed on whether whitespace should be kept" (?).

No spec-compliant XML parser should collapse whitespace. Some may offer
convenience methods to collapse whitespace, or to strip it from the
beginning and end of text; but those are opt-in conveniences that don't
break anything.
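
To illustrate the kind of opt-in convenience meant here, below is a minimal sketch of two helpers an application could call on text it received unmodified from the parser. The names collapse_whitespace and trim are hypothetical and are not part of TinyXML or TinyXML++; the point is only that normalization happens at the caller's request, after parsing.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: collapse every run of whitespace into a single space
// and drop leading and trailing whitespace. The caller opts in; the text
// delivered by the parser itself stays untouched.
std::string collapse_whitespace(const std::string& text) {
    std::string out;
    bool pending_space = false;
    for (unsigned char ch : text) {
        if (std::isspace(ch)) {
            // Remember a separator only once some content has been emitted.
            pending_space = !out.empty();
        } else {
            if (pending_space) out.push_back(' ');
            pending_space = false;
            out.push_back(static_cast<char>(ch));
        }
    }
    return out;
}

// Hypothetical helper: remove whitespace from both ends only,
// leaving interior whitespace (including line endings) as it was.
std::string trim(const std::string& text) {
    const char* ws = " \t\r\n";
    const auto first = text.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(ws);
    return text.substr(first, last - first + 1);
}
```

An application that wants condensed text calls collapse_whitespace(node_text) itself; one that needs the original layout simply doesn't.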

Not even HTML collapses whitespace when parsing. That happens at the final
layout step, before rendering: unless the element's white-space CSS
property is set to a value such as pre, runs of whitespace characters are
displayed as a single space.

Original issue reported on code.google.com by nicolas....@gmail.com on 4 Dec 2008 at 3:17

GoogleCodeExporter commented 9 years ago
I had a problem with this too. TinyXML++ really should not touch the text
content of XML nodes. In my case I needed to parse an XML file that has
text nodes like this:

<properties>
name=value1
type=some_type
color=23
width=45
</properties>

With the default settings, TinyXML++ removed all line endings, which
confused my internal parser for the node's text content.

Regards,
Łukasz

Original comment by chartsac...@gmail.com on 7 Nov 2010 at 2:07
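
The failure described in the comment above can be reproduced without any XML library. The sketch below assumes a hypothetical line-based parser, parse_properties, standing in for the commenter's internal parser, which is not shown in the thread. It is fed the text content once with line endings preserved and once as it would presumably look after whitespace condensing.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for the commenter's internal parser:
// reads "key=value" pairs, one per line.
std::map<std::string, std::string> parse_properties(const std::string& text) {
    std::map<std::string, std::string> props;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // skip blank or malformed lines
        props[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return props;
}

// Text content as the XML spec requires it to be passed through.
const std::string preserved =
    "\nname=value1\ntype=some_type\ncolor=23\nwidth=45\n";

// Assumed result of condensing: line endings replaced by single spaces.
const std::string condensed =
    "name=value1 type=some_type color=23 width=45";
```

With preserved, the parser finds four separate properties. With condensed, everything sits on one line, so only the key name is found and its value swallows the remaining three pairs. With TinyXML the usual workaround is to turn condensing off via SetCondenseWhiteSpace before parsing, as mentioned in the original report; the exact call to use should be checked against the library headers.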