sabre-io / xml

sabre/xml is an XML library that you may not hate.
http://sabre.io/xml/
BSD 3-Clause "New" or "Revised" License

mixed text/element in node discards text #107

Closed admonkey closed 8 years ago

admonkey commented 8 years ago

If my document has mixed text/elements in a node, for example:

<root>
   <node>
      some text
      <sub>
          deep text
      </sub>
   </node>
</root>

then how can I parse "some text"?

It seems to be simply ignored/discarded when parsed.

Thanks!

evert commented 8 years ago

Hi @admonkey,

This is by design. The goal of the default deserializer is to easily parse the most common case. Adding support for mixing text and elements would make parsing unnecessarily complicated.

However, there are specific areas where this makes sense. One example is XHTML embedded into Atom files.

So the question here is: what do you want to do with the data? If your actual use case is indeed XHTML inside Atom, you likely want access to the entire sub-document. In that case you'll probably want:

// elementMap is an array property keyed by Clark notation ('{namespace}name').
$service->elementMap['{}node'] = 'Sabre\Xml\Element\XmlFragment';

If there's another more specific parsing strategy you want, you just have to define your own deserializer:

$service->elementMap['{}node'] = function (Sabre\Xml\Reader $reader) {

    // Read the element's content here, advance the reader past its
    // closing tag, and then return the parsed value.
};
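
As a rough illustration of such a custom deserializer, the sketch below walks the children of a mixed-content element and keeps both the text and the child elements, in document order. The function name readMixedContent and the shape of the returned array are made up for this example. It relies only on built-in XMLReader methods, which Sabre\Xml\Reader inherits because it extends XMLReader, so the demo runs without sabre/xml installed. The registration shown in the closing comment assumes sabre/xml is installed and that 'node' has no namespace.

```php
<?php
// Hypothetical helper: collects text nodes and child elements of the
// element the reader is currently positioned on, in document order.
function readMixedContent(\XMLReader $reader): array
{
    $parts = [];

    // An empty element (<node/>) has no content; just step past it.
    if ($reader->isEmptyElement) {
        $reader->next();
        return $parts;
    }

    // Move from the opening tag to the first child node.
    if (!$reader->read()) {
        return $parts;
    }

    while ($reader->nodeType !== \XMLReader::END_ELEMENT) {
        if ($reader->nodeType === \XMLReader::TEXT
            || $reader->nodeType === \XMLReader::CDATA) {
            // Keep non-empty text, trimmed of surrounding whitespace.
            $text = trim($reader->value);
            if ($text !== '') {
                $parts[] = ['text' => $text];
            }
            $ok = $reader->read();
        } elseif ($reader->nodeType === \XMLReader::ELEMENT) {
            // Record the child's name and its text content, then skip
            // its whole subtree.
            $parts[] = [
                'element' => $reader->localName,
                'value'   => trim($reader->readString()),
            ];
            $ok = $reader->next();
        } else {
            // Whitespace, comments, and so on are ignored.
            $ok = $reader->read();
        }

        if (!$ok) {
            return $parts; // ran out of input (malformed document)
        }
    }

    // Step past the closing tag, as a deserializer is expected to do.
    $reader->read();

    return $parts;
}

$xml = <<<XML
<root>
   <node>
      some text
      <sub>
          deep text
      </sub>
   </node>
</root>
XML;

// Demo with a plain XMLReader: advance to <node>, then read its content.
$reader = new \XMLReader();
$reader->XML($xml);
while ($reader->read()) {
    if ($reader->nodeType === \XMLReader::ELEMENT && $reader->localName === 'node') {
        break;
    }
}
$parts = readMixedContent($reader);
print_r($parts);

// With sabre/xml, the same function could be registered like this:
// $service->elementMap['{}node'] = function (Sabre\Xml\Reader $reader) {
//     return readMixedContent($reader);
// };
```

For the document from the question, this collects "some text" as a text part and sub (with "deep text") as an element part. Inside a real sabre/xml deserializer, child elements could instead be handed to $reader->parseCurrentElement(), so that any mappings registered for them still apply.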

evert commented 8 years ago

Closing this issue, but feel free to follow up and ask more questions.