This is correct.
Is there a reason why text_output_escaped should escape ' to &apos;?
Original comment by arseny.k...@gmail.com
on 4 Oct 2012 at 3:18
The reasons, as I see them:
1. The XML 1.0 standard (Fifth Edition) defines a set of predefined entities
(amp, lt, gt, apos, and quot). See:
http://www.w3.org/TR/2008/REC-xml-20081126/
2. Some XML editors treat an unescaped apostrophe as an error. However, I can't
remember exactly which editor does this. :-(
3. The standard allows attribute values to be surrounded by apostrophes instead
of quotation marks. If such a value contains an apostrophe, it becomes ambiguous.
4. The strconv_escape method decodes the &apos; sequence, so text_output_escaped
should perform the reverse conversion.
All of the above is only IMHO.
Original comment by Pot...@gmail.com
on 5 Oct 2012 at 8:25
While &apos; is certainly an allowed entity, pugixml tries to preserve text that
can be preserved as is, without encoding it. For example, while it is possible
to encode non-ASCII characters as escape sequences, pugixml chooses not to, so
that localized text is left as is.
The standard does allow surrounding the attribute value with apostrophes;
however, in that case you have to encode apostrophes but can choose to leave
quotation marks as is. pugixml does not yet have an option to surround attribute
values with apostrophes during writing.
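To illustrate the rule about delimiters, here is a minimal sketch of delimiter-aware attribute escaping. It is not pugixml's implementation: escape_attribute, write_attribute and check_examples are hypothetical names made up for this example, and the sketch ignores whitespace and control-character handling. Only the character that matches the surrounding delimiter is encoded; the other quote character and any non-ASCII bytes pass through unchanged.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of pugixml): escapes an attribute value for
// output between the given delimiter, which is either '"' or '\''.
std::string escape_attribute(const std::string& value, char quote)
{
    std::string out;
    out.reserve(value.size());

    for (char ch : value)
    {
        switch (ch)
        {
        case '&': out += "&amp;"; break;  // always required
        case '<': out += "&lt;"; break;   // always required in attribute values
        case '"':
            // Only needs encoding when the value is delimited by quotation marks.
            if (quote == '"') out += "&quot;"; else out += ch;
            break;
        case '\'':
            // Only needs encoding when the value is delimited by apostrophes.
            if (quote == '\'') out += "&apos;"; else out += ch;
            break;
        default:
            out += ch;  // everything else, including UTF-8 bytes, is left as is
        }
    }

    return out;
}

// Hypothetical helper: formats name=<quote>escaped value<quote>.
std::string write_attribute(const std::string& name, const std::string& value, char quote)
{
    return name + "=" + quote + escape_attribute(value, quote) + quote;
}

// Small usage example: the same value written with each delimiter.
void check_examples()
{
    assert(write_attribute("title", "it's", '"') == "title=\"it's\"");
    assert(write_attribute("title", "it's", '\'') == "title='it&apos;s'");
}
```

With quotation marks as the delimiter, which is the only mode discussed in this thread, the apostrophe never needs encoding, so the output stays unambiguous.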
In short, outputting unescaped apostrophes is perfectly compliant with the XML
standard; any tool that does not recognize this violates the standard. If you
have an example of a tool or a library that does not work with whatever pugixml
outputs, please tell me and I'll reopen the issue; otherwise, I'd prefer to
leave this as it is.
Note that this might change when/if pugixml starts supporting
apostrophe-surrounded attribute values during printing.
Original comment by arseny.k...@gmail.com
on 11 Oct 2012 at 4:43
Original issue reported on code.google.com by
Pot...@gmail.com
on 4 Oct 2012 at 8:22