sabre-io / xml

sabre/xml is an XML library that you may not hate.
http://sabre.io/xml/
BSD 3-Clause "New" or "Revised" License

[Question] parsing current element using elementMap #69

Closed: judgej closed this issue 8 years ago

judgej commented 8 years ago

Not sure if this is the right place to ask questions like this, but here goes...

The reading documentation gives examples of a Book object being created using elementMap. With the example, the book element is just a wrapper and contains nothing but child elements.

    <book>
        <title>Snow Crash</title>
        <author>Neal Stephenson</author>
    </book>

The elementMap handler has access to all those child elements, as its context is book.

I have xml that looks more like this:

    <book isbn="123456781234" publisher="Acme Publishing">
        <details title="Snow Crash" author="Neal Stephenson" />
    </book>

So now when handling the book element, I need not only the child elements (and mine goes down a couple of more levels) but also attributes of the book element. How would I access those attributes in a handler like this:

    $reader->elementMap = [
        '{whatever}book' => function ($reader) {
            // ???
        },
    ];

I'm not asking for a complete solution, just how to access the attributes of the current book element at ???. I could do it one wrapper element further up, but then I would have such a complex data structure to handle that elementMap would be redundant.


Just writing that, it occurs to me that this book level could be used to simplify the details (and friends) elements into a flatter structure, so that there is a much simpler data structure to parse when putting the books together. Would that be a good way to go about this?

evert commented 8 years ago

Hi @judgej, this should do the trick:

    $reader->elementMap = [
        '{whatever}book' => function ($reader) {
            $attributes = $reader->parseAttributes();
            /* more stuff here */
        },
    ];

parseAttributes() is a sabre/xml extension, but don't forget that you also have the full power of PHP's XMLReader, which the sabre/xml reader extends.

I'm leaving this ticket open as a bug, because it might mean that our documentation can do better in terms of attributes.

judgej commented 8 years ago

Ah, thanks. I guess what I am ultimately trying to do is take the attributes of the book tag, combine them with the value of the book tag, then put the result (a Book object) into the value of the book element. It is kind of like moving the attribute data up a level. This does the trick nicely, thanks.

I notice I need to parse the attributes before I try to parse the inner tree, otherwise the reader context has moved on.

evert commented 8 years ago

Yea the description of your problem made perfect sense. And yea, attributes do need to be read first =)
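
A minimal sketch of that ordering, assuming sabre/xml is installed through Composer and using the XML from the question without a namespace (hence the `{}book` key). The shape of the returned array is only illustrative, not something the library prescribes:

```php
<?php
// Assumption: sabre/xml was installed with Composer.
require 'vendor/autoload.php';

use Sabre\Xml\Reader;

$xml = <<<XML
<book isbn="123456781234" publisher="Acme Publishing">
    <details title="Snow Crash" author="Neal Stephenson" />
</book>
XML;

$reader = new Reader();
$reader->elementMap = [
    // '{}book' is Clark notation for a <book> element with no namespace.
    '{}book' => function (Reader $reader) {
        // Read the attributes first, while the reader is still on <book>.
        $attributes = $reader->parseAttributes();
        // parseInnerTree() then consumes the children and moves past </book>.
        $children = $reader->parseInnerTree();

        // Illustrative result: attributes and children side by side.
        return ['attributes' => $attributes, 'children' => $children];
    },
];

$reader->xml($xml);
$result = $reader->parse();

// $result['value'] holds whatever the '{}book' callback returned.
print_r($result['value']);
```

Swapping the two calls inside the callback would leave the reader past the book element by the time the attributes are requested, which is the problem described above.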

judgej commented 8 years ago

My parser looks much nicer now, with objects and arrays all created in the proper place.

Another related question, and probably for the same documentation page: all your examples parse the XML into a long-form element array (name/value/attributes keys) where that element is the outermost wrapper element for the XML document. Suppose I want to get rid of that element too - what match is needed on the elementMap? I've tried {}, {}/, '', '/', but none of those work.

So instead of this:

    [
        'name' => '{}books',
        'value' => [/* array of Book objects */],
        'attributes' => [],
    ]

my aim would be to parse the XML file directly into an array of book objects, and not have to dig around in the value element to get that array. So XML file goes in, and an array of books comes out. Is that possible?

evert commented 8 years ago

The top-level will always emit those 3 things, but you can use the Service class instead.

See:

Service::parse() and Service::expect()

evert commented 8 years ago

Both the parse and expect functions parse an entire document, but only return the top-level value.

judgej commented 8 years ago

So parsing always returns a single element at the top level. I'll just grab the ['value'] element, after checking ['name'] is what I expect, and will be good to go. Just thought I may have been missing something. Thanks :-)

Haven't tried the service approach yet, but thanks for pointing it out. I noticed that the functions in there do not have their visibility declared (public/private/protected). I'm surprised that's not erroring.

evert commented 8 years ago

@judgej so the expect function actually does exactly what you describe. You give it the expected top-level element; it returns the value if the root element matches and throws an exception otherwise.
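
A rough sketch of that approach, assuming sabre/xml is installed through Composer and a hypothetical `<books>` wrapper without a namespace. The `{}book` callback and its return shape are illustrative only:

```php
<?php
// Assumption: sabre/xml was installed with Composer.
require 'vendor/autoload.php';

use Sabre\Xml\Reader;
use Sabre\Xml\Service;

$xml = <<<XML
<books>
    <book isbn="123456781234" publisher="Acme Publishing">
        <details title="Snow Crash" author="Neal Stephenson" />
    </book>
</books>
XML;

$service = new Service();
$service->elementMap = [
    '{}book' => function (Reader $reader) {
        // Attributes first, then the inner tree.
        $attributes = $reader->parseAttributes();
        $children = $reader->parseInnerTree();

        return ['isbn' => $attributes['isbn'] ?? null, 'children' => $children];
    },
];

// expect() checks the root element name and returns only its value;
// it throws an exception if the root is not {}books.
$children = $service->expect('{}books', $xml);

// Each child is still a name/value/attributes array, so keep only the
// values produced by the '{}book' callback.
$books = array_column($children, 'value');

print_r($books);
```

The `array_column()` step is one way to end up with a flat list of books; mapping `{}books` to its own callback in elementMap would be another.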

With regard to dropping public from method signatures, I just wrote a blog post about that this week: https://evertpot.com/php-code-in-2006-and-2016/

judgej commented 8 years ago

TIL, and I even read that entry when you posted it on reddit a few days ago. I'm sure in the past I've seen "method is not public" errors when accessing a method that did not declare any visibility at all. Probably just dreamt it.

evert commented 8 years ago

Yea the visibility keywords didn't exist in PHP4, and PHP5 was always meant to be compatible with 4, so they've always been optional =)

judgej commented 8 years ago

In the PHP docs:

Methods declared without any explicit visibility keyword are defined as public.

TIL :-)
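
A small check of that rule in plain PHP. The `Book` class here is hypothetical and exists only for the demonstration:

```php
<?php
// Hypothetical class, used only to demonstrate default visibility.
class Book
{
    // No visibility keyword: PHP treats this method as public.
    function title(): string
    {
        return 'Snow Crash';
    }
}

$book = new Book();

// Calling the method from outside the class works without error.
echo $book->title(), PHP_EOL; // prints "Snow Crash"

// Reflection confirms the method is public.
$method = new ReflectionMethod(Book::class, 'title');
var_dump($method->isPublic()); // bool(true)
```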