Closed Gounlaf closed 8 years ago
This is not going to work well, because by the time you call read or next, you might actually be deeper inside the XML document. Calling next will also traverse beyond END_ELEMENT, which is why you are never reaching it.
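To illustrate the point about next, here is a minimal sketch using PHP's built-in XMLReader (which Sabre\Xml\Reader extends). The sample document and variable names are made up for the example; it only shows that next skips an element's whole subtree, including its END_ELEMENT node.

```php
<?php
// Sketch with PHP's built-in XMLReader; the sample document is hypothetical.
$reader = new XMLReader();
$reader->XML('<root><a><b/></a><c/></root>');

$reader->read(); // now on <root>
$reader->read(); // now on <a>
$reader->next(); // skips <b/> and the closing </a>, lands on <c/>

// The END_ELEMENT for <a> was never exposed to the caller.
$landedOn = $reader->name;
$sawEndElement = ($reader->nodeType === XMLReader::END_ELEMENT);

echo $landedOn, PHP_EOL;
```

So a loop that waits for END_ELEMENT while advancing with next can miss it entirely.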
Could you share the xml snippet you are trying to parse, and what PHP data structure you wish to get? I might be able to rewrite it in a way where it's a lot clearer. Ideally you should never really have to do manual traversal like this, unless it's a special circumstance.
Could you share the xml snippet you are trying to parse, and what PHP data structure you wish to get?
I will try to share only the piece of XML that is wrong.
I might be able to rewrite it in a way where it's a lot clearer. Ideally you should never really have to do manual traversal like this, unless it's a special circumstance.
I use your reader as described in the documentation. I just went deeper into the code and found this loop; I tried to "debug it", trying to "force" the reader to find the next element. I don't use this loop myself ^^
When my element is parsed, it gets an error, but you silence it (https://github.com/fruux/sabre-xml/blob/master/lib/Reader.php#L145).
I var_dumped the content: LibXML Error Input is not proper UTF-8, indicate encoding !
(I don't have the complete message right now.)
And so the reader keeps looping on the same element, and always gets the same error.
Anyway, I will paste you a complete example =)
The error is indeed silenced there, but the error does get stored and we actually take it out here again:
https://github.com/fruux/sabre-xml/blob/master/lib/Reader.php#L151
There might be a bug in the error handling code though. The error you shared definitely seems to indicate so. So it would then be extra awesome to get a snippet of your xml that reproduces this, so we can make the parser more robust =)
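For readers unfamiliar with this pattern, here is a rough sketch of how libxml errors can be stored and retrieved later. It uses plain XMLReader and is not the actual sabre/xml code; the function name readAllCollectingErrors and the sample input are made up for illustration.

```php
<?php
// Sketch only: plain XMLReader, not the actual sabre/xml Reader code.
// readAllCollectingErrors is a hypothetical helper name.
function readAllCollectingErrors(string $xml): array
{
    // Ask libxml to store errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $reader = new XMLReader();
    $reader->XML($xml);

    // "@" silences PHP's own generic warning; the libxml details are still stored.
    while (@$reader->read()) {
        // Walk the document; read() returns false at the end or on a parse error.
    }
    $reader->close();

    $errors = libxml_get_errors(); // array of LibXMLError objects
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $errors;
}

$errors = readAllCollectingErrors('<a><b></a>');
foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ')', PHP_EOL;
}
```

The important part is that the stored errors are checked after read() returns false. If that check is skipped somewhere, a failed read could go unnoticed.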
Hi @Gounlaf, I received your data via email. Thanks very much for that.
I tried to reproduce the issue with the following script:
<?php
include 'vendor/autoload.php';
$reader = new Sabre\Xml\Reader();
$reader->open('xml_bug_wrong_encoding.xml');
var_dump($reader->parse());
This causes an exception to be triggered immediately:
Sabre\Xml\LibXMLException: Input is not proper UTF-8, indicate encoding !
Bytes: 0x1A 0x29 0x20 0x61
on line 24, column 25 in /Users/evert/code/sabre/xml/lib/Reader.php on line 155
Call Stack:
0.0002 228840 1. {main}() /Users/evert/code/sabre/xml/issue87.php:0
0.0571 587552 2. Sabre\Xml\Reader->parse() /Users/evert/code/sabre/xml/issue87.php:8
0.0571 588200 3. Sabre\Xml\Reader->parseCurrentElement() /Users/evert/code/sabre/xml/lib/Reader.php:69
0.0572 589120 4. call_user_func:{/Users/evert/code/sabre/xml/lib/Reader.php:231}() /Users/evert/code/sabre/xml/lib/Reader.php:231
0.1170 604496 5. Sabre\Xml\Element\Base::xmlDeserialize() /Users/evert/code/sabre/xml/lib/Reader.php:231
0.1170 604912 6. Sabre\Xml\Reader->parseInnerTree() /Users/evert/code/sabre/xml/lib/Element/Base.php:86
0.1171 605328 7. Sabre\Xml\Reader->parseCurrentElement() /Users/evert/code/sabre/xml/lib/Reader.php:161
0.1171 606096 8. call_user_func:{/Users/evert/code/sabre/xml/lib/Reader.php:231}() /Users/evert/code/sabre/xml/lib/Reader.php:231
0.1171 606128 9. Sabre\Xml\Element\Base::xmlDeserialize() /Users/evert/code/sabre/xml/lib/Reader.php:231
0.1171 606128 10. Sabre\Xml\Reader->parseInnerTree() /Users/evert/code/sabre/xml/lib/Element/Base.php:86
I would consider this the expected behavior. Are you also seeing this when you run my test script or do you get the loop?
The problem with your file, BTW, is that ASCII character 26 (0x1A) appears in your source. This is the CTRL-Z control character and should normally never appear in a text file. Still, it shouldn't cause a never-ending loop.
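As a possible workaround on the input side, the offending control characters could be removed before the data reaches the parser. This is only a sketch: stripInvalidXmlControlChars is a hypothetical helper, not part of sabre/xml, and whether silently dropping bytes is acceptable depends on your data.

```php
<?php
// Hypothetical helper (not part of sabre/xml): removes the C0 control bytes
// that XML 1.0 does not allow, keeping tab (0x09), LF (0x0A) and CR (0x0D).
function stripInvalidXmlControlChars(string $xml): string
{
    // These byte values never occur inside a multi-byte UTF-8 sequence,
    // so a byte-wise replacement is safe for UTF-8 input.
    $clean = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $xml);

    // preg_replace returns null on failure; fall back to the original input.
    return $clean ?? $xml;
}

// Made-up sample containing the same 0x1A byte reported in the exception.
$dirty = "<offre><titre>abc\x1A) a</titre></offre>";
$clean = stripInvalidXmlControlChars($dirty);

echo $clean, PHP_EOL;
```

The cleaned string could then be handed to the reader with its xml() method (inherited from XMLReader) instead of open().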
Hi @evert, if I use your example, yes, the exception is thrown.
But if I use the example given in "http://sabre.io/xml/reading/ => Using the XmlDeserializable interface", I get the loop. I sent you the "looping example" by email.
Would it be possible for you to email me a script that always reproduces the error? It's a bit hard for me to figure out where this is going wrong.
The example you mention on sabre.io should be fine because it just uses parseInnerTree, but I could be wrong.
So ideally, if you could send me a single PHP file that has a minimal sample that reproduces the problem for you, I would be very grateful!
Hi @evert,
I did yesterday, with the data I already sent to you and a copy/paste of the example, adapted for the data. Anyway, you can find it below :)
<?php
include 'vendor/autoload.php';

class Offers implements Sabre\Xml\XmlDeserializable
{
    public $data = array();

    static function xmlDeserialize(Sabre\Xml\Reader $reader) {
        $offers = new self();
        $children = $reader->parseInnerTree();
        foreach($children as $child) {
            if ($child['value'] instanceof Offer) {
                $offers->data[] = $child['value'];
            }
        }
        return $offers;
    }
}

class Offer implements Sabre\Xml\XmlDeserializable {
    static function xmlDeserialize(Sabre\Xml\Reader $reader) {
        $offer = new self();
        // Borrowing a parser from the KeyValue class.
        $keyValue = Sabre\Xml\Element\KeyValue::xmlDeserialize($reader);
        // if (isset($keyValue['{http://example.org/books}title'])) {
        //     $book->title = $keyValue['{http://example.org/books}title'];
        // }
        // if (isset($keyValue['{http://example.org/books}author'])) {
        //     $book->author = $keyValue['{http://example.org/books}author'];
        // }
        return $offer;
    }
}

$reader = new Sabre\Xml\Reader();
$reader->elementMap = [
    '{}offres' => 'Offers',
    '{}offre' => 'Offer',
];
$reader->open('xml_bug_wrong_encoding.xml');
var_dump($reader->parse());
I'm closing this ticket because it's been a while since the last comment.
If this is indeed still an issue, feel free to comment here so we can continue discussing.
This is just a general cleanup, and unfortunately this ticket never got a fully working sample. I might have gotten it via email back in February, but I no longer have it. If you still care about this ticket, please submit the info again (preferably just on GitHub).
Hi,
I have an XML input with encoding problems. I use the default KeyValue element.
Because of the encoding problem, the function "keyValue" is stuck in the loop; the reader object never reaches the "Reader::END_ELEMENT" node.
I've tried both:
It doesn't work; the reader stays stuck on the same element =/
Any idea or way to skip the element and move on?
Thanks,
Regards