laminas / laminas-feed

Consume and generate Atom and RSS feeds, and interact with Pubsubhubbub.
https://docs.laminas.dev/laminas-feed/
BSD 3-Clause "New" or "Revised" License

DOMDocument::loadXML(): CData section too big found #86

Closed: sreekeshkamath closed this issue 2 months ago

sreekeshkamath commented 2 months ago

Bug Report

Version(s): 2.22.0

Summary

If more than 10 MB of data wrapped in <![CDATA[{$content}]]> is passed to setContent(), DOMDocument::loadXML() fails with a "CData section too big found" error.

Please note that CDATA is still not being passed to setContent(), as mentioned here.

Current behavior

An error is thrown and execution stops at the loadXML() call in Writer/Renderer/Entry/Atom.php.

How to reproduce

While creating the feed, call the following PHP function and pass its result to setContent():

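private function getCdataFromContent(?string $content = ''): string
{
    // For the reproduction, the argument is overwritten with a string of
    // about 22 MB (22 bytes x 1,000,000), well over the 10 MB limit.
    $content = str_repeat('This is a large text. ', 1000000);

    // Wrap the content in a CDATA section before handing it to setContent().
    return "<![CDATA[{$content}]]>";
}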

Expected behavior

XML should be generated for content larger than 10 MB.

froschdesign commented 2 months ago

The XML_PARSE_HUGE flag (exposed in PHP as the LIBXML_PARSEHUGE option for DOMDocument::loadXML()) can help here, but it could lead to problems on the consumer side, because there is no guarantee that a consumer's parser can process such a huge file. What do you put in a feed that is this big?
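
For reference, PHP exposes libxml2's XML_PARSE_HUGE flag as the LIBXML_PARSEHUGE constant, which can be passed as the second argument to DOMDocument::loadXML(). The following is a minimal standalone sketch, not laminas-feed code: the helper name loadLargeCdata() and the <content> wrapper element are assumptions made for illustration.

```php
<?php
// Hypothetical helper: parse a CDATA payload with the given libxml options.
// Returns the parsed document, or null if libxml rejects the input.
function loadLargeCdata(string $payload, int $options = 0): ?DOMDocument
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'UTF-8');
    $ok  = $dom->loadXML("<content><![CDATA[{$payload}]]></content>", $options);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}

$payload = str_repeat('x', 11000000); // about 11 MB, above the 10 MB limit

// Default options: per the report above, this is expected to fail.
$default = loadLargeCdata($payload);
echo 'default options: ', $default === null ? 'failed' : 'parsed', "\n";

// LIBXML_PARSEHUGE relaxes the parser's hardcoded size limits.
$huge = loadLargeCdata($payload, LIBXML_PARSEHUGE);
echo 'LIBXML_PARSEHUGE: ', $huge === null ? 'failed' : 'parsed', "\n";
```

Relaxing the limit only affects the side that generates the feed; whoever consumes it still needs a parser that accepts content of this size.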

sreekeshkamath commented 2 months ago

It's basically a feed produced by a client whose content exceeds 10 million characters, which is why we need the XML_PARSE_HUGE flag for loadXML(). Can it be put as an optional argument in the package?

froschdesign commented 2 months ago

"Can it be put as an optional argument in the package?"

No, because a 10 MB feed is not a regular use case and may cause problems on the consumer side.

But you can extend the relevant classes for your use case:

https://github.com/laminas/laminas-feed/blob/669792b819fca7274698147ad7a2ecc1b0a9b141/src/Writer/Renderer/Feed/Atom.php#L63-L65

Then render the feed with your custom feed and entry classes:

$renderer = new MyCustom\Feed\Atom($feed);
$renderer->setType('atom');
$xml = $renderer->render()->saveXml();
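
The general shape of such an extension can be sketched as follows. This is a self-contained illustration of the subclassing pattern only: BaseEntryRenderer, HugeEntryRenderer and loadContent() are hypothetical stand-ins, not laminas-feed's actual classes or method names, so the real override has to be adapted to the renderer code linked above.

```php
<?php
// Hypothetical stand-in for a renderer that parses entry content
// with libxml's default options.
class BaseEntryRenderer
{
    public function render(string $content): string
    {
        $dom = $this->loadContent('<content>' . $content . '</content>');

        return $dom->saveXML();
    }

    protected function loadContent(string $xml, int $options = 0): DOMDocument
    {
        // Collect parser errors internally instead of emitting PHP warnings.
        $previous = libxml_use_internal_errors(true);

        $dom = new DOMDocument('1.0', 'UTF-8');
        $ok  = $dom->loadXML($xml, $options);

        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if (! $ok) {
            throw new RuntimeException('Unable to parse entry content');
        }

        return $dom;
    }
}

// Custom subclass: the only change is that the parse step opts in to
// LIBXML_PARSEHUGE, leaving the rest of the rendering untouched.
class HugeEntryRenderer extends BaseEntryRenderer
{
    protected function loadContent(string $xml, int $options = 0): DOMDocument
    {
        return parent::loadContent($xml, $options | LIBXML_PARSEHUGE);
    }
}

$renderer = new HugeEntryRenderer();
$xml = $renderer->render('<![CDATA[' . str_repeat('x', 11000000) . ']]>');
echo strlen($xml), " bytes rendered\n";
```

Keeping the flag in an application-level subclass leaves the package defaults as they are, which matches the reasoning given above for not adding an option to the package.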

sreekeshkamath commented 2 months ago

Oh ok, thought it might have been a bug. Then the only solution would be to extend the class. Thanks anyways!