openedx / modular-learning

3 stars 1 forks source link

[Libraries] [Collections] [Backend] Soft Delete collections #231

Open pomegranited opened 1 month ago

pomegranited commented 1 month ago

Currently, only "enabled" collections are displayed in the frontend, so "disabled" collections can be considered to be "soft deleted".

Implement APIs and views to support "soft deleting" collections, and add Django Admin pages for our collection models.

openedx-learning

Python API:

Add Collections model Django Admin, using basic configuration options to:

Don't worry about making a Collection's components add/removable here (because there can be lots), we're just worried about Collection editing for now.

edx-platform

Add library collection views:

Out of scope

  1. Soft-deleted collections are not automatically cleaned up -- Django Admins must do this manually.
bradenmacdonald commented 1 week ago

@sdaitzman @lizc577 CC @jmakowski1123 Would one of you be able to answer @pomegranited's question here? When a collection is deleted, is it "gone gone", or should we kind of disable it and hide it, but keep the ability to potentially un-deleted it again in the future?

Just straight-up deleting it is the easiest to implement. Keeping it around but enabled may have issues if someone later tries to import a collection that has the same name/ID/key (though we can find a way to handle that).

sdaitzman commented 1 week ago

@bradenmacdonald @pomegranited I'll defer to @jmakowski1123 since this is more a product question, but here are some general thoughts. We can also talk more at our meeting later today.

If there's limited incremental cost in retaining collections for e.g. 90 days, I think it would make sense to implement that way. Even if collections are only recoverable by administrators/not through the UI initially, the ability to recover deleted content is important. Some trash/recovery workflow is common/borderline expected in a lot of authoring software with similar UI. At some point in the future, it would make sense to design and introduce a restore workflow. The existing "publish library" screen and publishing workflow are very simple right now, and we could add restoring/archiving content to that workflow in a future version. Smaller details like a temporary "Undo" button could also improve the experience.

If implementing soft-delete will add scope or more significant cost/delay, I think I'm okay with holding off for the libraries MVP. I'd prioritize shipping and getting real user feedback over a restore-collection workflow or recovery option.

bradenmacdonald commented 1 week ago

@pomegranited How much work will soft delete be? We discussed this on a call, and if you can find a relatively simple way to implement soft delete (so that e.g. an admin could un-delete it if needed), let's go with that for now. The main problem to solve is that if someone later imports (or creates) a new collection with the same key, we need to not throw an error (either by permanently deleting the soft-deleted collection or changing its key?).

pomegranited commented 1 week ago

Hiya @bradenmacdonald @sdaitzman Soft-deleting is simple to implement, and a Django-Admin-only UI for managing the details seems sufficient for now.

The main problem to solve is that if someone later imports (or creates) a new collection with the same key, we need to not throw an error (either by permanently deleting the soft-deleted collection or changing its key?).

Our current solution for creating collections handles auto-assigning and de-duplicating the key for new collections, so this won't be an issue unless we change the create flow.

I think we can implement this with 3 tickets:

  1. [Backend] Delete Collection API + views + refactor event handlers (2 PRs)
  2. [Backend] Manage Collections in Django Admin (1 PR)
  3. [Frontend] Delete Collection with Undo Toast (1 PR)

openedx-learning

edx-platform:

frontend-authoring:

Out of scope

  1. Soft-deleted collections are not automatically cleaned up -- Django Admins must do this manually.
  2. There's a potential performance concern with updating the search index -- deleting a collection from the index is quick, but we also need to remove it from all of the content objects the collection contains. Currently we handle these search index updates synchronously for responsiveness in the MFE (cf discussion). But when a deleted collection contains many content objects, we may need allow for asynchronous reindexing of its content objects.
bradenmacdonald commented 1 week ago

@pomegranited Yeah I'm more worried about imports in the future (where we need to keep the key the same as whatever's in the imported data file) than creating collections. We can solve that problem down the road; we're not implementing imports for MVP.

[Backend] Manage Collections in Django Admin (1 PR)

I wouldn't spend much time on this. We don't actually have a requirement to allow soft-deleting collections via the admin or any other means. It's just if we have the option of implementing it as soft-delete or hard-delete on the backend, for now we'll go with soft-delete, but we don't have to implement the whole delete-then-recover workflow in the admin UI.

when a deleted collection contains many content objects, we may need allow for asynchronous reindexing of its content objects.

Good point. I think that's fine. We can just make all collection deletions async in terms of updating the items.

pomegranited commented 1 week ago

Thanks @bradenmacdonald .

I wouldn't spend much time on [Django Admin].

Cool -- that drops us down to two tickets for this change. I've updated the description here to describe the backend changes needed.

We can just make all collection deletions async in terms of updating the items.

That's the problem -- currently the CONTENT_OBJECT_ASSOCIATIONS_CHANGED event handler updates the search index synchronously, because each individual update is small. If we refactor the event handlers like I've proposed, we'll be sending a CONTENT_OBJECT_ASSOCIATIONS_CHANGED event for each component in a soft or hard-deleted collection, and they'll all be handled synchronously too. I don't currently have a way to send events asynchronously (or to indicate in the event data that this event should be handled asynchronously).

Still ok to leave that part out of scope here?

bradenmacdonald commented 6 days ago

@pomegranited I think we can leave it out of scope for now and plan in the future to include some data in the event to indicate if it's part of a mass change like a mass-delete that should be handled async.

bradenmacdonald commented 6 days ago

@pomegranited CC @rpenido Something cool I just learned about: The newest version of Meilisearch has a (beta) feature that lets you use a function to update documents. So for example, you can say "filter by: type = block AND collections includes collection 15" and "apply function: collections = collections.remove(15)"

See "Experimental: Edit documents with a Rhai function" at https://github.com/meilisearch/meilisearch/releases/tag/v1.10.0

We may want to think about a way to make that sort of update possible in the future, because I bet you it's a lot faster than re-indexing every document fully. But of course, it only works if you're using Meilisearch and we may need to support other search engines in the future.

rpenido commented 5 days ago

That's great!

So for example, you can say "filter by: type = block AND collections includes collection 15" and "apply function: collections = collections.remove(15)"

Do you think it is worth starting to use it for our CONTENT_OBJECT_ASSOCIATIONS_CHANGED event, or should we at least wait for it to be a non-experimental feature?

The delete and update events will be super fast (we don't even need to iterate the objects).

bradenmacdonald commented 5 days ago

@rpenido I think we should wait until the feature is stable. Meilisearch is developed fairly fast so I would expect them to stabilize it in their next release.

pomegranited commented 1 day ago

@bradenmacdonald We've got an implementation working for this feature, but there's an issue with the Django Admin: if you modify a collection from the LMS Django Admin, the Studio Search handlers don't get run (because content/search isn't installed in the LMS), and so the search index doesn't get updated. But if you modify the collection from the Studio Django Admin (which not everyone even knows they have), then it works just fine.

I think this will be resolved in production where people use the event bus? But it's an issue in dev and where there's no event bus.

Is this ok for now?

bradenmacdonald commented 23 hours ago

@pomegranited Can we add a little notice that appears in the Django admin, warning users about it? We certainly don't need to fix that as we don't really need to support changes in the django admin. It's just nice to have. But we might as well put a notice into the admin HTML if we can, at the least to save developers and admins some confusion.

pomegranited commented 20 hours ago

@bradenmacdonald

Can we add a little notice that appears in the Django admin, warning users about it?

Yep I can on the edit page, but can't do anything without custom templates on the list page.

image

bradenmacdonald commented 10 hours ago

@pomegranited That's fine! Thanks.