collective / transmogrify.ploneremote

Transmogrifier blueprints for uploading content via XML-RPC to a Plone site
http://pypi.python.org/pypi/transmogrify.ploneremote

MemoryError #4

Closed by zopyx 13 years ago

zopyx commented 13 years ago

WegweiserPsychiatrie.pdf
INFO:ploneupdate:aktuelles/publikationen/documents/broschuere-ldsweb.pdf set fields=['file']
INFO:plonepublish:aktuelles/publikationen/documents/broschuere-ldsweb.pdf performing transition 'publish'
INFO:plonealias:aktuelles/publikationen/documents/broschuere-ldsweb.pdf Adding redirection from /aktuelles/publikationen/documents/broschuere_ldsWeb.pdf
Traceback (most recent call last):
  File "bin/funnelweb", line 115, in <module>
    funnelweb.runner.runner({'ploneupload': {'target': 'http://admin:123@localhost:13080/Plone'}, 'crawler': {'url': 'file:///home/ajung/www.dahme-spreewald.de'}})

tried to import http://www.dahme-spreewald.de/

There are only 380 HTML pages and several hundred images and files in the site. I cannot understand why it would fail at import time on my box with 4 GB of RAM; that's really not a huge amount of data.

  File "/home/ajung/.buildout/eggs/funnelweb-1.0b4-py2.6.egg/funnelweb/runner/__init__.py", line 108, in runner
    transmogrifier(u'transmogrify.config.funnelweb', overrides)
  File "/home/ajung/.buildout/eggs/collective.transmogrifier-1.2-py2.6.egg/collective/transmogrifier/transmogrifier.py", line 62, in __call__
    for item in pipeline:
  File "/home/ajung/.buildout/eggs/transmogrify.webcrawler-1.0b4-py2.6.egg/transmogrify/webcrawler/staticcreator.py", line 34, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/transmogrify.ploneremote-1.0b2-py2.6.egg/transmogrify/ploneremote/remoteprune.py", line 88, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/transmogrify.ploneremote-1.0b2-py2.6.egg/transmogrify/ploneremote/remoteredirector.py", line 25, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/transmogrify.ploneremote-1.0b2-py2.6.egg/transmogrify/ploneremote/remoteworkflowupdater.py", line 40, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/collective.transmogrifier-1.2-py2.6.egg/collective/transmogrifier/sections/inserter.py", line 19, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/transmogrify.ploneremote-1.0b2-py2.6.egg/transmogrify/ploneremote/remotenavigationexcluder.py", line 34, in __iter__
    for item in self.previous:
  File "/home/ajung/.buildout/eggs/transmogrify.ploneremote-1.0b2-py2.6.egg/transmogrify/ploneremote/remoteschemaupdater.py", line 79, in __iter__
    input = urllib.urlencode(arguments)
  File "/opt/python-2.6/lib/python2.6/urllib.py", line 1269, in urlencode
    v = quote_plus(str(v))
  File "/opt/python-2.6/lib/python2.6/urllib.py", line 1230, in quote_plus
    s = quote(s, safe + ' ')
  File "/opt/python-2.6/lib/python2.6/urllib.py", line 1224, in quote
    res = map(safe_map.__getitem__, s)
MemoryError
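
The last frames show a whole field value being percent-encoded in one call. As a rough illustration of why that is costly for binary files, here is a small sketch. It assumes Python 3's urllib.parse rather than the Python 2.6 urllib in the traceback, and the names encoded_size and binary_chunk as well as the synthetic payload are made up for the example; it is not the actual PDF or the ploneremote code.

```python
from urllib.parse import quote_plus


def encoded_size(payload):
    """Return the length of the percent-encoded form of payload (bytes)."""
    return len(quote_plus(payload))


# Synthetic stand-in for binary file content: every byte value, repeated.
# (Assumption for illustration only; a real PDF has a different byte mix.)
binary_chunk = bytes(range(256)) * 4

# Most non-alphanumeric bytes become three characters ("%XX"), so the
# encoded string is a full second copy that is larger than the input.
ratio = encoded_size(binary_chunk) / len(binary_chunk)
print("encoded/original size ratio: %.2f" % ratio)
```

On top of the encoded copy, the Python 2.6 quote() in the traceback builds an intermediate list with one entry per input character (the map call on line 1224), so peak memory during encoding is several times the file size.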

djay commented 13 years ago

This is odd. We've done some biggish sites without a problem. Currently, however, funnelweb keeps all content in memory, which could be an issue, particularly if you have some large PDF documents. I do have an idea for reducing this: leave all File objects stored in the disk cache until they are uploaded, since their contents are unlikely to be modified.
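
A minimal sketch of that idea, not the funnelweb implementation: the pipeline item carries only a reference to the cached file on disk, and the bytes are read in chunks at upload time. The names CachedFile, upload and send, the item keys, and the 64 KiB chunk size are all assumptions made for the example.

```python
import os
import tempfile


class CachedFile(object):
    """Hypothetical payload wrapper: remembers where the bytes live on
    disk instead of holding them in memory."""

    def __init__(self, path):
        self.path = path

    def size(self):
        return os.path.getsize(self.path)

    def chunks(self, chunk_size=64 * 1024):
        # Read the file lazily so only one chunk is in memory at a time.
        with open(self.path, "rb") as handle:
            while True:
                block = handle.read(chunk_size)
                if not block:
                    break
                yield block


def upload(item, send):
    """Stream item['file'] through send() chunk by chunk.

    send is a placeholder for whatever actually transmits the data.
    """
    sent = 0
    for block in item["file"].chunks():
        send(block)
        sent += len(block)
    return sent


# Demo with a temporary file standing in for a cached PDF.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200000)

item = {"_path": "documents/example.pdf", "file": CachedFile(tmp.name)}
received = []
total = upload(item, received.append)
os.remove(tmp.name)
print(total, len(received))
```

Intermediate pipeline sections would then pass around a small object per file, and memory use during upload is bounded by the chunk size rather than by the largest document.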

zopyx commented 13 years ago

Actually I dropped funnelweb from my radar.

I took my own migration framework and was able to import the site into Plone within 5 minutes.

djay commented 13 years ago

I appreciate the time taken to submit the bugs, as the fixes will help others. If you have any tips on making funnelweb easier to use, other than fixing its bugs, then let us know.

djay commented 13 years ago

The implementation has changed to no longer keep files in memory.