lookit / lookit-api

Codebase for Lookit v2 and Experimenter v2. Includes an API. Docs: http://lookit.readthedocs.io/
https://lookit.mit.edu/
MIT License

Celery job to handle incomplete video uploads in S3 #1438

Closed becky-gilbert closed 3 weeks ago

becky-gilbert commented 1 month ago

Summary

The AWS S3 multipart upload system (used in EFP and jsPsych) can leave video uploads incomplete, e.g. if the request that completes the upload fails or the participant leaves the study early (see https://github.com/lookit/ember-lookit-frameplayer/issues/395). Researchers are requesting access to those partial recordings, so we need an automated task that completes incomplete video uploads.

Description

We are trying to prevent video recording uploads from remaining in an incomplete state (see https://github.com/lookit/ember-lookit-frameplayer/issues/395), but that is not always possible, so we still need a way of handling the video uploads that remain incomplete in S3. We discussed simply deleting incomplete uploads after some duration, but researchers have told us that videos from incomplete sessions are still helpful: they can be used to verify that a child participated (for compensation purposes), and they sometimes yield usable data, e.g. when the video recording is mostly complete. Given how often we get requests for partial/incomplete video recordings, it makes sense to create a Celery job that automatically tries to complete any incomplete video uploads so that they become available to the researcher.

It is not always possible to complete an incomplete upload. Sometimes the upload was created but is empty, corrupted, or missing parts. In these cases, we should probably just delete the partial upload to free up S3 storage space.
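As a starting point for discussion, the complete-or-delete logic described above could be sketched roughly as follows. This is a minimal sketch, not the actual implementation: the function names, the `min_age` cutoff (to avoid touching uploads that may still be in progress), and the rule that part numbers must run 1, 2, 3, ... without gaps are all assumptions. The `s3` argument is expected to behave like a boto3 S3 client; only the `list_multipart_uploads`, `list_parts`, `complete_multipart_upload` and `abort_multipart_upload` calls are used. Pagination is ignored for brevity.

```python
from datetime import datetime, timedelta, timezone

# S3 requires every part except the last to be at least 5 MiB.
MIN_PART_SIZE = 5 * 1024 * 1024


def is_completable(parts):
    """Return True if the uploaded parts look like they can be assembled."""
    if not parts:
        return False  # upload was created but nothing was uploaded
    numbers = [p["PartNumber"] for p in parts]
    # Assumption: the recorder numbers parts 1, 2, 3, ... so a gap means
    # a part is missing.
    if numbers != list(range(1, len(parts) + 1)):
        return False
    # S3 rejects completion if any part other than the last is too small.
    return all(p["Size"] >= MIN_PART_SIZE for p in parts[:-1])


def handle_incomplete_uploads(s3, bucket, min_age=timedelta(hours=24), now=None):
    """Complete or abort incomplete multipart uploads in `bucket`.

    Hypothetical helper: `min_age` is an assumed safety margin so that
    uploads still in progress are left alone.
    """
    now = now or datetime.now(timezone.utc)
    results = {"completed": [], "aborted": [], "skipped": []}

    # Single page only; a real job would follow the pagination markers.
    uploads = s3.list_multipart_uploads(Bucket=bucket).get("Uploads", [])
    for upload in uploads:
        key, upload_id = upload["Key"], upload["UploadId"]

        if now - upload["Initiated"] < min_age:
            results["skipped"].append(key)
            continue

        response = s3.list_parts(Bucket=bucket, Key=key, UploadId=upload_id)
        parts = sorted(response.get("Parts", []), key=lambda p: p["PartNumber"])

        if is_completable(parts):
            s3.complete_multipart_upload(
                Bucket=bucket,
                Key=key,
                UploadId=upload_id,
                MultipartUpload={
                    "Parts": [
                        {"PartNumber": p["PartNumber"], "ETag": p["ETag"]}
                        for p in parts
                    ]
                },
            )
            results["completed"].append(key)
        else:
            # Empty or missing parts: abort to free the stored parts.
            s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
            results["aborted"].append(key)

    return results
```

In the Celery job, something like `handle_incomplete_uploads(boto3.client("s3"), bucket_name)` could be called from a periodic task; the bucket name, the schedule, and how completed videos are then linked to responses are left open here. Note that this sketch cannot detect corrupted parts, only empty uploads, gaps in part numbers, and undersized parts.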

Acceptance criteria