TECLIB / CFPropertyList

PHP Implementation of Apple's PList (binary and xml)
http://teclib.github.io/CFPropertyList/
MIT License

Error reading large plist files #33

Open jacobraccuia opened 9 years ago

jacobraccuia commented 9 years ago

Thanks for the plist reader!

I am using a form to have users submit their itunes.xml plist to the site, which then parses the data and displays statistics about their listening habits.

It works fine with uploaded files of around 20 MB, but a 72 MB file throws the following error.

Warning: DOMDocument::loadXML(): (null)(null)xmlSAX2Characters: out of memory in Entity, line: 538384 in ..../classes/CFPropertyList/CFPropertyList.php on line 267.

I've increased the PHP memory limit on my server to 528 MB and set a long timeout to help troubleshoot this, but it didn't get me anywhere.

You can test it yourself at: http://jacobraccuia.com/most_listened.php

Is this a server error or a bug in the library?

Thanks

lpotherat commented 8 years ago

I think CFPropertyList will not be able to read large files, because it is not a stream parser. Even if we parsed the XML with a stream-based parser, the class would still store all the information in memory, which is not feasible with large files. A stream-based CFPropertyList reader would be nice, but it's a lot of work!

YannickGagnon commented 8 years ago

Hey @jacobraccuia, it's not a bug and it's not really a server error either. Once you load a file into CFPropertyList, it creates an object for every value in the plist, and there is no option other than keeping all of them in memory. As I see it, you have two options: either raise your memory limit and execution time limit and cap the size of uploaded files, or work directly on the XML with XPath.
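As a rough illustration of the XPath option, here is a minimal sketch that reads the values stored under a given plist key without building any CFPropertyList objects. The function name `plistValuesForKey` is hypothetical, and the sketch assumes the usual XML plist layout in which each `<key>` inside a `<dict>` is followed by its value element. Note that `DOMDocument` still loads the whole document, so this only avoids the per-value objects, not the DOM itself.

```php
<?php
// Hypothetical helper (not part of CFPropertyList): collect the text of every
// value element that directly follows a <key> with the given name.
function plistValuesForKey(string $xml, string $key): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new RuntimeException('Could not parse the XML plist');
    }

    $xpath = new DOMXPath($doc);
    $values = [];

    // Assumption: keys and values are sibling elements inside <dict>.
    foreach ($xpath->query('//dict/key') as $keyNode) {
        if ($keyNode->textContent !== $key) {
            continue;
        }
        // The value is the first element sibling after the key.
        $valueNode = $xpath->query('following-sibling::*[1]', $keyNode)->item(0);
        if ($valueNode !== null) {
            $values[] = $valueNode->textContent;
        }
    }

    return $values;
}
```

For an iTunes library this could be called as `plistValuesForKey(file_get_contents($path), 'Name')` to list track names, at the cost of holding the full DOM in memory.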

Hope this helps

jasper2virtual commented 7 years ago

Hi @lpotherat, understood: CFPropertyList is not a stream parser. I have the same headache, since my project's plist files are very large. I have an idea: would it be possible to use a separate stream reader to read the plist file first, pick out each repeating node, and then pass that node's XML text to CFPropertyList? For example:

  1. Use XMLReader to parse my plist file.
  2. In the XMLReader reading loop, each time I capture a node I want, pass it to a new CFPropertyList object via loadXMLStream. Is that possible? Anyway, I will try it today.
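
The two steps above could be sketched roughly as follows. This is only an illustration of the streaming half, not the code from any pull request: the function name `streamPlistElements` and its parameters are hypothetical, and it assumes the repeating nodes can be identified by element name and nesting depth (the root `<plist>` element is depth 0).

```php
<?php
// Hypothetical helper: stream through a large XML file and yield the outer XML
// of each element with the given name at the given depth, one at a time.
function streamPlistElements(string $path, string $name, int $depth): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Could not open $path");
    }

    try {
        $more = $reader->read();
        while ($more) {
            $isMatch = $reader->nodeType === XMLReader::ELEMENT
                && $reader->name === $name
                && $reader->depth === $depth;

            if ($isMatch) {
                // Only this fragment is materialised as a string.
                yield $reader->readOuterXml();
                // Skip the subtree that was just returned.
                $more = $reader->next();
            } else {
                $more = $reader->read();
            }
        }
    } finally {
        $reader->close();
    }
}
```

Each yielded fragment could then be handed to a fresh CFPropertyList object, so that only one node's objects are alive at a time. Whether the library accepts a bare fragment, or needs it wrapped in a `<plist>` root element first, is an assumption to check against the library.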

lpotherat commented 7 years ago

Sorry for the late answer. Do you have any news on your tests, @jasper2virtual?

jasper2virtual commented 6 years ago

Hi @lpotherat, yes, I have tried it and created a pull request: https://github.com/rodneyrehm/CFPropertyList/pull/37

rodneyrehm commented 6 years ago

Please note that per #36 we're looking for someone to take over maintenance of this project. I have moved on from PHP and have neither the time nor the desire to keep working on CFPropertyList.