Open straight-shoota opened 4 years ago
IMO Crystal API XML::Builder
shouldn't be exposing text
, attribute
, attributes
etc at Document
level, as these are element level things and should only be made available to element
(first part of your example above).
So this isn't a problem with the underlying library, but how we have encapsulated that into API.
The XML::Builder
interface is agnostic to this. There's no way to statically restrict the methods you can call on it. That depends on the current context, which is a runtime property.
What you describe is actually a different problem and not really related to this issue. This is about an entirely valid represenation of a DOM (a document fragment containing a text node) being improperly created. This is not a failure of the API (which is completely fine as per my example above) but the implementation is flawed.
When using
XML::Builder
, the#text
method is supposed to insert text content and make sure it's properly escaped to not allow injecting content that could be interpreted as tags. This generally works as expected, but there's a weird special case when text is added at the top scope of a document fragment, i.e. not wrapped in an element.The reason for this is in libxml2's
xmlTextWriterWriteString
method which for some reason seems to not escape the string if there's no entry on the writer stack:https://github.com/GNOME/libxml2/blob/bfc0f674ccf5c3929ce22e29307a6ffae17428ad/xmlwriter.c#L1491-L1524
I have no idea what's the reason for this and I can't fidn any further documentation nor bug reports for this. But it's really hard to find such things for libxml2. Maybe I should open an issue at https://gitlab.gnome.org/GNOME/libxml2/-/issues but I can't imagine this hasn't come up before, so there must be some information, I just can't find it. Maybe someone here has an idea?
The behaviour of our exposed API is certainly unexpected and a potential security issue because
text
should be expected to always escape, no matter the builder state. We don't even expose a#raw
method yet which could be used for explicitly escaped content (for example usingHTML.escape
).