saloonphp / xml-wrangler

🌵 XML Wrangler - Easily Read & Write XML in PHP
MIT License
364 stars, 15 forks

DOMDocument throwing empty source validation error #29

Open zainavrillo534102 opened 4 months ago

zainavrillo534102 commented 4 months ago

I ran into an issue today while fetching XML data from a SOAP API. When I tried to convert that data using XML Wrangler, I got the error below. I even validated the XML with a custom function built on DOMDocument, and it reports the XML as valid.

function isValidXml($xml): bool
{
    // Collect libxml errors instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $doc = new \DOMDocument();
    $isValid = $doc->loadXML($xml);
    if (!$isValid) {
        $errors = libxml_get_errors();
        foreach ($errors as $error) {
            Log::error('XML Error: ' . $error->message);
        }
        libxml_clear_errors();
    }
    return $isValid;
}

ValueError: DOMDocument::loadXML(): Argument #1 ($source) must not be empty in /home/forge/proclaim.avrillo.co.uk/vendor/veewee/xml/src/Xml/Dom/Loader/xml_string_loader.php:18
Stack trace:
#0 /home/forge/proclaim.avrillo.co.uk/vendor/veewee/xml/src/Xml/Dom/Loader/xml_string_loader.php(18): DOMDocument->loadXML()
#1 /home/forge/proclaim.avrillo.co.uk/vendor/veewee/xml/src/Xml/Dom/Loader/load.php(20): VeeWee\Xml\Dom\Loader\{closure}()
#2 /home/forge/proclaim.avrillo.co.uk/vendor/azjezz/psl/src/Psl/Result/wrap.php(23): VeeWee\Xml\Dom\Loader\{closure}()

To Reproduce

  1. Due to its sensitivity I can't share the XML here, as it contains a base64-encoded attachment, but when I used this package with the same incoming XML it did not give any error. This was my code snippet:
    return XmlReader::fromString($this->xmlData)->removeNamespaces()->values();

Sammyjo20 commented 3 months ago

Hey @zainavrillo534102, thanks for raising this issue with me. I have also seen this happen when the XML is slightly invalid, or when parsing a subset of HTML/XML. I will raise it with the maintainer of veewee/xml and see if I can get a new option that relaxes some of the validation.

@veewee I hope you are well! I've had this issue a few times, especially when I want to parse just a small part of HTML like <p>Hello</p>. (Overly simplified I know) but when I use DOMDocument directly it doesn't throw any errors. Is there a configuration option in the reader so it accepts anything assuming it looks like valid XML/HTML?

veewee commented 3 months ago

The error from the initial post is quite clear: an empty string is being loaded as XML, which is invalid. This error is thrown by PHP's DOMDocument. So I suppose one of the methods being called is producing an empty string.
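The empty-source failure can be shown without any real data. The following is a minimal sketch, assuming PHP 8 or later (where `DOMDocument::loadXML()` throws a `ValueError` for an empty string). The helper name `loadXmlOrNull` is hypothetical and is not part of XML Wrangler or veewee/xml; it only illustrates checking the input before it reaches the parser.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of any library mentioned in this thread):
 * returns a DOMDocument, or null when the input is empty or not well-formed.
 */
function loadXmlOrNull(?string $xml): ?DOMDocument
{
    // Guard first: on PHP 8+, an empty source makes loadXML() throw ValueError.
    if ($xml === null || trim($xml) === '') {
        return null;
    }

    // Collect parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// Reproduce the failure from the stack trace with nothing but an empty string.
$thrown = null;
try {
    (new DOMDocument())->loadXML('');
} catch (ValueError $e) {
    $thrown = $e->getMessage();
}
```

If a guard like this returns null for the value passed to `XmlReader::fromString()`, the problem lies upstream (for example, an empty SOAP response body) rather than in the XML parsing itself.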

Would it be possible to create a simplified reproducer with truncated data @zainavrillo534102 ?

@Sammyjo20 the XML library parses XML, not HTML. I don't know if the XML reader extension supports that dialect. Maybe also for that issue, a reproducer would be nice in order to understand what you are trying to do.
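For HTML fragments, one option outside the XML reader is PHP's own HTML parser. This is only a sketch under that assumption, using `DOMDocument::loadHTML()` with the libxml flags `LIBXML_HTML_NOIMPLIED` and `LIBXML_HTML_NODEFDTD`; the function name `parseHtmlFragment` is hypothetical and nothing here is an XML Wrangler or veewee/xml feature.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: parses a small HTML fragment with PHP's HTML parser,
 * which tolerates markup that is not well-formed XML (e.g. an unclosed <br>).
 */
function parseHtmlFragment(string $html): DOMDocument
{
    // Collect parser notices internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // NOIMPLIED: do not wrap the fragment in <html><body>.
    // NODEFDTD: do not add a default doctype.
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

// <br> is never closed here, so this fragment is HTML but not well-formed XML.
$doc = parseHtmlFragment('<p>Hello<br>world</p>');
$paragraph = $doc->getElementsByTagName('p')->item(0);
```

Keeping HTML on `loadHTML()` and XML on the XML reader avoids asking a strict XML parser to accept a different dialect.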

Sammyjo20 commented 3 months ago

Thank you for the very quick reply! Noted about the HTML - I kind of assumed because it looked XML-ish, the reader would understand it, but I won't try and use it for HTML in the future.

With regard to fixing this specific issue, if you are able to give us a string that reproduces it, that would be great!