Closed GoogleCodeExporter closed 9 years ago
Why is it a problem?
There's no requirement for the apostrophe character to be escaped in XML;
pugixml tries to escape as little data as possible to preserve readability
while producing well-formed output.
Original comment by arseny.k...@gmail.com
on 23 May 2013 at 3:36
Please see the wiki.
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML
"The XML specification defines five "predefined entities" representing special
characters, and requires that all XML processors honor them."
Original comment by marek.k...@gmail.com
on 23 May 2013 at 4:29
Note that Wikipedia is not an authoritative source of information on XML
parsing. In this case "honor" most likely means "decode while parsing",
not "encode while saving".
Please refer to the XML standard (http://www.w3.org/TR/REC-xml/) for further
information; it clearly states that attribute values can contain unescaped
apostrophe characters:
[10] AttValue ::= '"' ([^<&"] | Reference)* '"'
| "'" ([^<&'] | Reference)* "'"
And a slightly related quote:
To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;".
In attribute values, pugixml chooses to escape > for symmetry reasons (< has to
be escaped to conform to the XML standard), but not to escape ' for increased
output readability.
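As a rough illustration of that policy (a minimal sketch, not pugixml's actual
code), the function below escapes an attribute value that is assumed to be
written between double quotes. The name escape_attribute_value is hypothetical
and is not part of the pugixml API.

```cpp
#include <string>

// Hypothetical helper, NOT part of pugixml: escapes text for use inside a
// double-quoted XML attribute value, following the policy described above.
std::string escape_attribute_value(const std::string& value)
{
    std::string out;
    out.reserve(value.size());

    for (char ch : value)
    {
        switch (ch)
        {
        case '&': out += "&amp;";  break; // required by the AttValue production
        case '<': out += "&lt;";   break; // required by the AttValue production
        case '>': out += "&gt;";   break; // not required; escaped for symmetry with <
        case '"': out += "&quot;"; break; // required: it is the assumed delimiter
        default:  out += ch;       break; // the apostrophe passes through unescaped
        }
    }

    return out;
}
```

With this sketch, the value it's <b> would be written as it's &lt;b&gt;, which
is still well-formed inside a double-quoted attribute.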
Original comment by arseny.k...@gmail.com
on 23 May 2013 at 5:10
Thanks for the explanation! You can close the issue.
Original comment by marek.k...@gmail.com
on 24 May 2013 at 5:06
Original comment by arseny.k...@gmail.com
on 25 May 2013 at 3:39
Original issue reported on code.google.com by
marek.k...@gmail.com
on 22 May 2013 at 3:26