devilry / devilry-django

Devilry project main repository
http://devilry.org
BSD 3-Clause "New" or "Revised" License

Limit memory usage of ZIP creation for container environment #1278

Closed torgeirl closed 5 months ago

torgeirl commented 9 months ago

When hosting Devilry as a service in a container orchestration environment, the memory allocated during the creation of compressed archives (which can reach multiple GBs) is problematic.

Suggested solutions:

a) either add an option to stream the construction of the compressed archives using something like zipfly or stream-zip,

b) or implement an option where ZIP creation uses disk (in practice a persistent volume in the container's pod) instead of memory, before the archive is moved from disk to the storage location (as defined in settings; in our case S3 storage).
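To illustrate what option a) buys us: the standard library's `zipfile` can already write to an unseekable stream, so the archive bytes can be handed to the consumer chunk by chunk instead of being buffered whole. This is only a stdlib sketch of the idea (the libraries named above do this properly); `ChunkWriter` and `stream_archive` are made-up names, not Devilry or stream-zip APIs.

```python
import io
import zipfile


class ChunkWriter:
    """Minimal unseekable file-like object: every chunk zipfile writes
    is handed to a callback instead of growing an in-memory buffer."""

    def __init__(self, callback):
        self._callback = callback

    def write(self, data):
        self._callback(bytes(data))
        return len(data)

    def flush(self):
        pass


def stream_archive(members):
    """Yield the bytes of a ZIP archive chunk by chunk while it is
    being built, so only small buffers live in memory at any time.
    `members` is an iterable of (name, bytes) pairs."""
    pending = []
    writer = ChunkWriter(pending.append)
    # zipfile detects that the stream is unseekable and falls back to
    # data descriptors, so it never needs to seek back in the output.
    with zipfile.ZipFile(writer, mode="w",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in members:
            zf.writestr(name, data)
            yield from pending
            pending.clear()
    # closing the archive appends the central directory
    yield from pending
```

A generator like this can be consumed by an HTTP response body or an upload API, so neither the full archive nor a temp file is ever required.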

Levijatan commented 6 months ago

Working on a stream-zip solution; ZIP creation of feedbacksets for examiners is done, I just need to implement it for the other places. It seems to work really well. It does not have to use local storage, since it creates the ZIP in small chunks as it is streamed to the user. So: small memory footprint and no local storage needed.
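The "small chunks" point applies on the input side too: each member can be fed into the archive from an iterator of byte chunks, so a large delivery file never has to sit in memory in one piece. A stdlib sketch of that side (the helper name and member name are hypothetical, not Devilry code):

```python
import io
import zipfile


def add_member_in_chunks(zf, name, chunk_iter):
    """Write a single archive member from an iterator of byte chunks,
    so a large file never has to be held in memory whole.
    Hypothetical helper for illustration only."""
    info = zipfile.ZipInfo(name)
    info.compress_type = zipfile.ZIP_DEFLATED
    # ZipFile.open(..., mode="w") compresses incrementally as we write.
    with zf.open(info, mode="w") as dest:
        for chunk in chunk_iter:
            dest.write(chunk)


# Usage: the chunks could just as well come from a remote blob store.
buf = io.BytesIO()
with zipfile.ZipFile(buf, mode="w") as zf:
    add_member_in_chunks(zf, "delivery.txt",
                         (b"chunk-%d " % i for i in range(3)))
```

Combined with chunked output, only one small buffer of input plus one small buffer of compressed output is alive at any moment.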

Levijatan commented 6 months ago

Committed a fix for this issue in 4b7965e

torgeirl commented 6 months ago

Changes to how ZIPs are created make CompressedArchiveMeta obsolete.

The use of compressed archive metas was quite limited, but they made it easy to find an archive's path and (especially) its time of deletion without having to browse the containing folder(s). Is that no longer displayed anywhere in Django admin?

We currently delete older ZIP files with automated scripts (i.e. `python manage.py devilry_delete_compressed_archives --days 14`), but with S3 it's perhaps better to set a time to live upon upload?
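As far as I know S3 has no per-object TTL set at upload time, but a bucket lifecycle rule gives the same effect: objects under a prefix are expired a fixed number of days after creation, which would replace the cron-style cleanup script. A sketch of such a rule matching the 14-day window above (the `ID` and `Prefix` values here are made-up placeholders, not Devilry's actual paths):

```json
{
  "Rules": [
    {
      "ID": "expire-compressed-archives",
      "Status": "Enabled",
      "Filter": { "Prefix": "compressed_archives/" },
      "Expiration": { "Days": 14 }
    }
  ]
}
```

This would be applied once to the bucket (e.g. via `aws s3api put-bucket-lifecycle-configuration`), after which S3 deletes the objects itself.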