Open ormsbee opened 5 months ago
Some more thoughts on this...
We can delete old `PublishableEntityVersions` as new ones are created; there's no need to wait for a publish. We should be able to do this relatively quickly, particularly since we'd only be deleting one at a time in that case.
The hard part about pruning is determining which `Content` entries are safe to delete. The `components` app knows how to prune unused `Content` that it has associations with, but other things might use that same content, especially if we model large collections of files as something other than `Components`. Also, pruning the files backing `Content` will be a slower operation.
If we're willing to allow `Content` pruning to be slow, we can have a pluggable thing where multiple apps get to contribute querysets to exclude from pruning.
So `Content` pruning would do this:
This prune would get called more frequently, say after each publish, and it would work in small increments.
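As a rough illustration of the pluggable idea, here is a minimal sketch that uses plain Python sets of IDs as stand-ins for the querysets. Everything named here (`register_exclusion_provider`, `prune_content_batch`, `batch_size`) is hypothetical and not an existing API; a real version would presumably combine Django querysets with `.exclude()` instead.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical registry: each app registers a callable returning the IDs of
# the Content rows it still needs. In Django these would be querysets.
ExclusionProvider = Callable[[], Set[int]]
_exclusion_providers: List[ExclusionProvider] = []


def register_exclusion_provider(provider: ExclusionProvider) -> None:
    """Let an app contribute Content IDs that must not be pruned."""
    _exclusion_providers.append(provider)


def prune_content_batch(candidate_ids: Iterable[int], batch_size: int = 100) -> Set[int]:
    """Return the IDs in one small batch of candidates that no app has claimed."""
    # Union of everything any registered app wants to keep.
    protected: Set[int] = set()
    for provider in _exclusion_providers:
        protected |= provider()

    # Work in small increments: only look at the first batch_size candidates.
    batch = sorted(candidate_ids)[:batch_size]
    return {content_id for content_id in batch if content_id not in protected}
```

Each call handles at most `batch_size` candidates, so a post-publish task could call it repeatedly without holding long-running transactions.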
Edge case: Multiple `Content` entries can point to the same backing file if they're of different media types, so we need to be careful not to delete that file if there is any other `Content` referencing it.
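A small sketch of that check, assuming we can build a mapping from `Content` ID to a backing file identifier (the `content_to_file` mapping and the function name are made up for illustration):

```python
from typing import Dict, Iterable, Set


def files_safe_to_delete(
    content_to_file: Dict[int, str],
    content_ids_to_prune: Iterable[int],
) -> Set[str]:
    """Return backing files that no surviving Content entry references."""
    pruned = set(content_ids_to_prune)

    # Files still referenced by Content entries that are NOT being pruned.
    surviving_files = {
        file_id for content_id, file_id in content_to_file.items()
        if content_id not in pruned
    }
    # Files referenced by the Content entries we are about to prune.
    candidate_files = {
        content_to_file[content_id] for content_id in pruned
        if content_id in content_to_file
    }
    return candidate_files - surviving_files
```

For example, if `Content` 1 and 2 share file `"abc"` and only 1 is pruned, `"abc"` stays on disk until 2 is pruned as well.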
The `PublishLog` and having LearningPackage-local `Content` entries makes it easier for us to do pruning in small cycles, like as a post-publish task.

Proposed Solution
Step 1: As a post-publish async task for any given `PublishableEntity`, delete all `PublishableEntityVersions` that are older than a certain period (1 week?), but preserve the following:

- Any `PublishableEntityVersion` that has ever been published (appears in a `PublishLogRecord`)

Rely on cascading deletion behavior to delete `Component`/`ComponentVersion`.

Step 2: After the deletions in Step 1, find any unreferenced `Content` entries and delete those.
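The selection logic for the two steps might look roughly like the sketch below. It uses plain dataclasses rather than the real models, and the `Version` class, function names, and `max_age` parameter are all assumptions for illustration. It implements only the preservation rule stated above (keep anything that has ever been published).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Version:
    """Stand-in for a PublishableEntityVersion row (hypothetical shape)."""
    id: int
    created: datetime


def versions_to_delete(
    versions: Iterable[Version],
    published_version_ids: Set[int],
    now: datetime,
    max_age: timedelta = timedelta(weeks=1),
) -> List[int]:
    """Step 1: pick versions older than max_age that were never published."""
    cutoff = now - max_age
    return [
        v.id for v in versions
        if v.created < cutoff and v.id not in published_version_ids
    ]


def unreferenced_content(
    all_content_ids: Set[int],
    referenced_content_ids: Set[int],
) -> Set[int]:
    """Step 2: Content entries nothing points to after Step 1's deletions."""
    return all_content_ids - referenced_content_ids
```

In the real task, `published_version_ids` would come from the `PublishLogRecord` entries for the entity, and `referenced_content_ids` would be recomputed after the cascading deletes have run.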