lookit / lookit-api

Codebase for Lookit v2 and Experimenter v2. Includes an API. Docs: http://lookit.readthedocs.io/
https://lookit.mit.edu/
MIT License
10 stars 18 forks source link

Investigate video deletion from S3 #1423

Closed becky-gilbert closed 1 month ago

becky-gilbert commented 2 months ago

Summary

As part of the work to address Sentry errors, we found that it's possible for the 'delete videos from cloud' celery task to try to delete a video that does not exist in the database (addressed here #1412). There are 356 instances of this 'Video.DoesNotExist' error recorded by Sentry in our production environment. This raised some questions about how the normal sequence of events for deleting videos should work.

Update: the testing below confirms that the 'Video.DoesNotExist' error was not due to an edge case. It happens with our normal video deletion process. The video object is being deleted from the database before the 'delete video from S3' celery task runs, and the code in that task tries to access the video in the database to get the filename. At that point the video is gone from the database, which produces the error, and the 'delete from S3' task fails because there's no filename for it to find.

To do

Test the video deletion process by deleting a video via the site, and verifying that it has been deleted both in the database and on S3. If the video does not exist in the database at the time when the S3 deletion task runs, then there could be an error that causes the S3 deletion to fail. The S3 deletion might take up to 7 days (per code comments). We will know if this process has failed if the video still exists in S3 after 7 days. This test might also trigger a new 'video does not exist' error log in Sentry.

Open questions

What are all the user actions that can trigger video deletion?

Possible code changes

If there is a problem with the normal S3 deletion process, a possible fix would be to delete the video from the database and from S3 at the same time. (Though perhaps there was a reason for making this a delayed task, e.g. we want the video deletion to be 'soft', so that the video can be recoverable for a short period).

Another option is to pass the video file name into the delayed S3 deletion task, so that this task does not need to access the (since deleted) video from the database to get the filename.

Update: this is moved to #1430.

Other steps

Even if we make the code changes above, we will need to address the backlog of video files in S3 that should've been deleted. We can do this by (1) going through the list of "Video.DoesNotExist" Sentry errors and deleting those videos from S3, and/or (2) comparing the video file names from S3 with those in our DB and removing any from S3 that do not exist in our DB.

Update: this is moved to #1431.

becky-gilbert commented 1 month ago

Closing this as completed - testing has confirmed the video deletion problems, which will now be handled via #1430 and #1431.