mediacms-io / mediacms

MediaCMS is a modern, fully featured open source video and media CMS, written in Python/Django and React, featuring a REST API.
https://mediacms.io
GNU Affero General Public License v3.0

md5sum on uploaded videos before moving them from temporary directory #937

Open KyleMaas opened 6 months ago

KyleMaas commented 6 months ago

Describe the issue

I think this is part of the cause of #852. The other day, while MediaCMS was processing an upload, I noticed it running an md5sum on the uploaded video before it responded to the client with a success message. Tracking this down, it looks like this is being done here:

https://github.com/mediacms-io/mediacms/blob/c5047d8df8686d75100e5099489be4fd1bf5f733/files/helpers.py#L273

And going up the call stack:

https://github.com/mediacms-io/mediacms/blob/c5047d8df8686d75100e5099489be4fd1bf5f733/files/models.py#L459

https://github.com/mediacms-io/mediacms/blob/c5047d8df8686d75100e5099489be4fd1bf5f733/files/models.py#L430

Which I think is coming from here:

https://github.com/mediacms-io/mediacms/blob/c5047d8df8686d75100e5099489be4fd1bf5f733/files/models.py#L1373

For those of us with large, slow primary video storage but a fast /tmp on a tmpfs, the md5sum really should be computed locally on the server processing the upload, while the file is still in /tmp. As it stands, the uploading client has to wait for the file to be written to the slow disk and then read back again for the md5sum.

To Reproduce

The impact is not particularly noticeable unless you're processing huge video files on slow storage, so this may be hard to reproduce without uploading videos more than an hour long. However, if you can set up the conditions correctly:

  1. Upload a long video - at least an hour
  2. In a separate window, open a terminal and watch ps or top for an md5sum command
  3. Notice that this happens very early in the process

Expected behavior

The md5sum should be calculated on the uploaded file before it is moved from /tmp to primary storage.
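
To illustrate the ordering I mean, here is a minimal sketch, not MediaCMS code. The helper names `md5_of_file` and `checksum_then_move` are hypothetical, and it uses Python's `hashlib` instead of shelling out to the `md5sum` binary, which is a swap I made to keep the example self-contained. The point is only that the hash is read from the fast temporary location and the slow storage is touched once, for the write.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat even for multi-gigabyte videos.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_then_move(tmp_path, final_dir):
    """Hash the upload while it is still in temporary storage, then move it.

    Hypothetical helper for illustration; not part of MediaCMS.
    """
    tmp_path = Path(tmp_path)
    final_dir = Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)

    # Read happens against the fast temporary location (e.g. a tmpfs /tmp).
    checksum = md5_of_file(tmp_path)

    # Only after hashing does the file go to the (possibly slow) final storage.
    final_path = final_dir / tmp_path.name
    shutil.move(str(tmp_path), str(final_path))
    return checksum, str(final_path)
```

With this ordering the checksum can be stored alongside the media record, so nothing needs to read the file back from primary storage just to hash it.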

Screenshots

N/A

Environment (please complete the following information):

KyleMaas commented 5 months ago

https://github.com/mediacms-io/mediacms/blob/c5047d8df8686d75100e5099489be4fd1bf5f733/uploader/views.py#L68