Take advantage of new Python 2.6 zip handling

GoogleCodeExporter commented 9 years ago

Bookworm should only require Python 2.5, but add a setting to enable
file-based zip parsing: stream the uploaded file to the filesystem (in
Django), then open the temp zip file on the filesystem in streaming mode.
Discard each in-memory resource as soon as it's processed.

This should dramatically reduce the amount of per-process memory required
when uploading.
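
A minimal sketch of the streaming idea described above, written for current Python rather than 2.6. `ZipFile.open()` (new in 2.6) returns a file-like object for a single member, so each resource can be read in chunks and dropped before the next one is opened. The function name `iter_zip_resources` and the chunk size are hypothetical, not Bookworm code.

```python
import zipfile


def iter_zip_resources(zip_path, chunk_size=64 * 1024):
    """Yield (member name, bytes read) for each member of the archive.

    Each member is read in fixed-size chunks from a file on disk, so no
    member is ever held in memory as a whole.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            total = 0
            # open() gives a file-like object for this one member only.
            with archive.open(info) as member:
                while True:
                    chunk = member.read(chunk_size)
                    if not chunk:
                        break
                    # A real importer would parse or store the chunk here.
                    total += len(chunk)
            yield info.filename, total
```

Because the function is a generator, the caller decides how long to hold on to each result; nothing accumulates unless the caller keeps it.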

Original issue reported on code.google.com by liza31337@gmail.com on 15 Feb 2009 at 4:37

GoogleCodeExporter commented 9 years ago

Do this in concert with file Storage updates.

Original comment by liza31337@gmail.com on 13 Apr 2009 at 4:29

Added labels: Infrastructure

GoogleCodeExporter commented 9 years ago

Cleaned up much of the file handling. In the old code an epub was
potentially being read and re-read repeatedly: first in the serialization
by Django, then when I stored it temporarily to the filesystem (even though
Django may already have stored it there), then when it was written to the
local storage class, and finally when it was re-read into RAM to be handed
off to ZipFile.

The new method forces Django to always create a temp file (previously it
would store smaller epubs in memory). It then does a filehandle copy to the
storage area. Finally, the zipfile module should now be reading from a
filehandle too, rather than from a StringIO in memory.
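
A rough sketch of that flow, under stated assumptions. Listing only `TemporaryFileUploadHandler` in Django's `FILE_UPLOAD_HANDLERS` setting is one documented way to make every upload land in a temp file; whether the branch does it this way is not stated here. The helper `store_and_open` and its parameters are hypothetical, and `upload` stands in for any seekable binary file object.

```python
import shutil
import zipfile

# Settings fragment (assumption): with only the temp-file handler listed,
# Django never keeps an upload in memory, regardless of its size.
FILE_UPLOAD_HANDLERS = [
    "django.core.files.uploadhandler.TemporaryFileUploadHandler",
]


def store_and_open(upload, dest_path, buffer_size=1024 * 1024):
    """Copy an uploaded file handle to dest_path, then list its members.

    The copy goes handle to handle in buffer_size pieces, and ZipFile is
    given an open file rather than an in-memory copy of the archive.
    """
    upload.seek(0)
    with open(dest_path, "wb") as dest:
        shutil.copyfileobj(upload, dest, buffer_size)
    with open(dest_path, "rb") as stored:
        with zipfile.ZipFile(stored) as archive:
            return archive.namelist()
```

Passing the open file to `ZipFile` lets it seek within the stored copy and read only the members that are requested.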

Not sure what the per-process memory saving is, but the test suite is much 
faster now:

BEFORE (current trunk)
Ran 112 tests in 104.422s

AFTER (infrastructure branch)
Ran 117 tests in 65.673s

Note that the number of tests went up!

Original comment by liza31337@gmail.com on 26 Jul 2009 at 4:48

Changed state: Fixed

srilatha44 / threepress

Take advantage of new Python 2.6 zip handling #100