ImperialCollegeLondon / django-drf-filepond

A Django app providing a server implementation for the Filepond file upload library
BSD 3-Clause "New" or "Revised" License
103 stars, 39 forks

Add new ChunkedUploadedFile object to resolve memory usage issues when combining file chunks #82

jcohen02 closed 1 year ago

jcohen02 commented 1 year ago

This update addresses issue #64 by adding a new DrfFilepondChunkedUploadedFile object that is a specialisation of django.core.files.uploadedfile.UploadedFile. This new object avoids the need to separately combine file chunks in memory when creating a complete file object for storage.

Previously, we combined all the file chunks in a BytesIO object and then used this as the basis for an InMemoryUploadedFile object. The content of the BytesIO object was then copied, leaving two in-memory copies of the data and memory usage of double the size of the uploaded file. For large uploads this is a problem, as highlighted in #64.

The new object can be passed directly to a TemporaryUpload object for saving. When the file content is read, the logic in DrfFilepondChunkedUploadedFile opens the relevant chunk file on disk and returns the requested data, so the complete file never needs to be assembled in memory. This should resolve the issue of high memory usage.
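To illustrate the general idea of reading from chunk files on demand, here is a minimal, Django-free sketch. It is not the code in this PR: `ChunkedFileReader`, its attributes and the `demo` helper are hypothetical names chosen for illustration, and the real DrfFilepondChunkedUploadedFile subclasses Django's UploadedFile and may differ in detail. The sketch only shows how `read()` and `chunks()` can be served from an ordered list of chunk files so that at most the requested number of bytes is held in memory at once.

```python
import os
import tempfile


class ChunkedFileReader:
    """Hypothetical file-like reader over an ordered list of chunk files.

    Illustration only: this is not the class added by the PR.
    """

    def __init__(self, chunk_paths):
        self.chunk_paths = list(chunk_paths)
        # Total size is derived from file metadata; no chunk data is loaded.
        self.size = sum(os.path.getsize(p) for p in self.chunk_paths)
        self._index = 0   # index of the chunk file currently being read
        self._offset = 0  # read position within that chunk file

    def read(self, size=-1):
        """Return up to `size` bytes, crossing chunk boundaries as needed."""
        if size is None or size < 0:
            size = self.size  # upper bound; the loop stops at end of data
        parts = []
        remaining = size
        while remaining > 0 and self._index < len(self.chunk_paths):
            with open(self.chunk_paths[self._index], "rb") as f:
                f.seek(self._offset)
                data = f.read(remaining)
            if not data:
                # Current chunk is exhausted; continue with the next one.
                self._index += 1
                self._offset = 0
                continue
            self._offset += len(data)
            remaining -= len(data)
            parts.append(data)
        return b"".join(parts)

    def chunks(self, chunk_size=64 * 1024):
        """Yield the content in pieces of at most `chunk_size` bytes."""
        while True:
            data = self.read(chunk_size)
            if not data:
                break
            yield data


def demo():
    """Write three small chunk files and stream them back in 4-byte pieces."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, payload in enumerate([b"hello ", b"chunked ", b"world"]):
            path = os.path.join(tmp, f"chunk_{i}")
            with open(path, "wb") as f:
                f.write(payload)
            paths.append(path)
        reader = ChunkedFileReader(paths)
        return reader.size, b"".join(reader.chunks(chunk_size=4))
```

A storage backend that consumes the content through `chunks()` would only ever hold one `chunk_size` piece at a time, which is the property that avoids the doubled memory usage described above.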