strapi / rfcs

RFCs for Strapi future changes
68 stars 33 forks source link

Request — Soft-delete entries #13

Open Aurelsicoko opened 4 years ago

Aurelsicoko commented 4 years ago

This issue has been opened because we (as the core team) are looking for contributors to help us co-develop the feature.

Link to the roadmap

Motivations

In several cases, it might be interesting to not really delete the entries in databases but only make them unavailable from a user perspective.

Tasks

  1. Find a contributor
  2. Define the needs
  3. Discuss a solution and technical implementation
  4. Ask for design
  5. Submit RFC

Risks

ionut-botizan commented 4 years ago

I've always been a supporter of having some form of "soft delete" as the default action everywhere where a "delete" operation is allowed, however, the performance hit can be a real issue, depending on the implementation.

I'd just like to make a suggestion and that is to maybe implement this feature as a "Recycle Bin" / "Trash Can" kind of thing that could be enabled for any type of entities. Don't just flag an entry as "deleted" and leave it in the original table/collection/data set; instead have a special storage where all deleted items are kept, along with "meta" information about them (entity type, deletion date, etc.). This way, the performance won't be affected during normal operation and it would only be a concern when specifically querying for deleted items.

vynarim commented 3 years ago

Having a recycle bin is very important because in real life, contributors make mistakes and delete wrong content. It is difficult to explain customer we have to restore database because a couple of content were accidentaly deleted

A solution si to put deleted content in a recycle bin with a restore action and a retention duration of 30 days as example. After this delay it will be automatically deleted for real

sundowndev commented 3 years ago

I'd just like to make a suggestion and that is to maybe implement this feature as a "Recycle Bin" / "Trash Can" kind of thing that could be enabled for any type of entities. Don't just flag an entry as "deleted" and leave it in the original table/collection/data set; instead have a special storage where all deleted items are kept, along with "meta" information about them (entity type, deletion date, etc.). This way, the performance won't be affected during normal operation and it would only be a concern when specifically querying for deleted items.

@ionut-botizan Sorry but I don't understand your point. Having a deleted_at date field that is nullable, and querying entries where deleted_at is null would be very easy to implement with low impact on performances, don't you think ? Having a "Recycle Bin" table would require a lot of logic to restore a deleted entry or querying deleted entries, that is where the performance impact take place IMHO.

ionut-botizan commented 3 years ago

@sundowndev You're looking at it the wrong way. The primary use case would be querying for the active items, not for the ones that were deleted. In the case of a deleted_at field, all the queries in your consumer app (and even the CMS) will need to add a WHERE deleted_at IS NULL clause. Having an index on the published_at field would normally cause the engine to use that index for sorting (thus improving performance) when you just want the latest x items, but adding that IS NULL clause will likely cause the engine to do a full table scan anyway because most of the records will have that deleted_at field as NULL, slightly affecting the performance for the end users.

Other arguments I would bring against a deleted_at field are:

Arguments for a recycle_bin table:

Querying for deleted items would be just as simple as SELECT ... FROM recycle_bin WHERE entity_type = "something". Restoring would be just a matter of doing something like (pseudo-code)

serializedData = query('SELECT entity_data FROM recycle_bin WHERE id = "x"')
fieldValues = unserialize(serializedData)
query('INSERT INTO some_entity_table VALUES ($fieldValues)')
query('DELETE FROM recycle_bin WHERE id = "x"')

In my opinion, this is just a little bit of extra logic when compared with a simple UPDATE on the deleted_at field while the performance penalty would be insignificant. Also, I would not try to optimize for these rare, edge cases (realistically, how often will an item be restored?), but optimize for the most common cases, such as (like I said before) the case where you query for live data from the database.

sundowndev commented 3 years ago

@ionut-botizan Thanks for explaining your point. The problem I have with this recycle_bin design is type safety and data restoration.

How do you restore data from recycle bin? You store it in a JSON field? What happen if you migrate a table and then restore a soft deleted entry? You obviously can't migrate data in the JSON field. You could create a dedicated recycle bin for each table with soft deletion enabled. But that'd require duplicate database migrations.

Soft deletion is not only about preventing data loss, it can be data consistency when you have foreign keys for example (and strapi does not handle cascade deletion pretty well). It can be required when you want defer the deletion to a background job (although it's not supported by strapi).

You leave unnecessary trash in your tables

You're right, but I don't see that as a problem. As data may be restored at any time, you need to keep the deleted entries sync with your schema.

You add metadata fields that are not relevant to your schema (deleted_at, maybe you also want a deleted_by and who knows what else you might end up needing). This way your schema ends up being more complex than it needs to be.

If I needed that kind of information I'd think about another way of handling soft deletion. Because it gets really specific to the application design.

You make purging deleted items after a set amount of time unnecessarily complex. You will first have to determine which of your entity types have the soft delete functionality enabled in order to know on which tables you need to run the DELETE query

To solve that I think I'd add an internal CRON job for each entity with soft deletion enabled that make the purge. The user could choose the interval (or no interval to never purge). It'd allow the user to enable and configure purge for each entities independently.

IF you ever want to implement a CMS wide section where you want the admins/editors to view all the deleted items in one place, you'll need to do the same thing as for purging: first determine which entities use soft delete in order to UNION all those tables to build the deleted items list

It sounds weird to me to have a list of mixed entities in one place, I'd usually want to see the list of deleted items for a single entity at once instead.

My conclusion about that topic is: soft deletion is hard and always requires sacrifices (performances, UX, ...). I also found a good article about that https://medium.com/galvanize/soft-deletion-is-actually-pretty-hard-cb434e24825c

rachasakr commented 2 years ago

As a developer who use Strapi since the beta day, I'd like to jump into this conversation here.

First of all, soft delete is a feature in Strapi v3 already (with Bookshelf). Removing this feature without notice in v4 is ridiculous IMHO. How can they "migrate" with the feature removed I wonder?

Second, regarding metadata fields, you should not forget that we also have both *_at and _by fields integrated into the schema already (and they're required fields by v4 ). Adding deleted_at shouldn't be that much problem since it's a timestamp anyway.

Third, for some implementations that might not have a complex logging system, having deleted_at can be a good piece of evidence for investigating or using it as a "trail" to looking for access logs and else.

Fourth, don't forget that the table can be indexed in other shapes or forms too. If there's a concern about the performance issue, the index with nullable fields can be made to improve the performance. Especially in this day of age where the storage is so cheap, you can buy gigabytes for a cent, concerning unnecessary data with 8 bytes field shouldn't be a major concern anyway.

And lastly, some implements might not want to drop the foreign field's relationship with the deleted entity. Having soft delete means once the entity is "deleted", the relationship that other entities might rely on can still rely on and it can be restored later.

derrickmehaffy commented 2 years ago

First of all, soft delete is a feature in Strapi v3 already (with Bookshelf). Removing this feature without notice in v4 is ridiculous IMHO. How can they "migrate" with the feature removed I wonder?

We didn't remove anything other than bookshelf, it was not a feature we developed nor intended thus we couldn't notify anyone of anything if we were not aware it existed since we didn't add it ourselves. There are hundreds of bookshelf plugins that cannot be used in the v4 since we opted to write our own ORM (it's not really an ORM but close enough in this respect).

The point of these RFCs is for those to make suggestions before a feature is built so that some agreement can be made as to how a feature should be built and function.

kevinvugts commented 7 months ago

Is this conversation still hot topic for v5 roadmap? @derrickmehaffy

I think having a perm delete in real world applications isn't an option anyway. And if it is, it's probably just for some really simple app/api. Most of the companies want (historical) data. So soft deletion of records would have my preference in the first place. Or at least some way of having a historical data layer so you won't lose any data unnecessary.

derrickmehaffy commented 7 months ago

Is this conversation still hot topic for v5 roadmap? @derrickmehaffy

I think having a perm delete in real world applications isn't an option anyway. And if it is, it's probably just for some really simple app/api. Most of the companies want (historical) data. So soft deletion of records would have my preference in the first place. Or at least some way of having a historical data layer so you won't lose any data unnecessary.

No probably not and I would also strongly disagree that hard delete use-cases are small.

In any case there is a community plugin for soft deletes: https://www.npmjs.com/package/strapi-plugin-soft-delete

kevinvugts commented 7 months ago

Is this conversation still hot topic for v5 roadmap? @derrickmehaffy I think having a perm delete in real world applications isn't an option anyway. And if it is, it's probably just for some really simple app/api. Most of the companies want (historical) data. So soft deletion of records would have my preference in the first place. Or at least some way of having a historical data layer so you won't lose any data unnecessary.

No probably not and I would also strongly disagree that hard delete use-cases are small.

In any case there is a community plugin for soft deletes: https://www.npmjs.com/package/strapi-plugin-soft-delete

Could you please elaborate more on why you think hard delete use-cases aren't smaller than soft delete use-cases? In a world were data is key?

I think this should be part of the core since soft-deletion makes sure that you build up historical data, which is a lot of cases, if not all cases, is very important for companies. A simple setting can allow the user to opt out of this default logic if you'd ask me.

derrickmehaffy commented 7 months ago

I've been around in the community since 2017, joined the company in 2020 and since both of those dates I've seen maybe 10 or 20 use-cases where it made sense.

In terms of our customer base I can't think of any enterprise customer who has asked about soft delete except for one or two but it wasn't a hard requirement.

Not saying the feature wouldn't be useful but in terms of priorities there are a lot of other features higher on that list like content versioning.

kevinvugts commented 7 months ago

Okey, that's fair enough. You guys know more about your existing customers. But please don't take your customer base as the single source of truth.

I am aware of literally hundreds of websites/apps/tools that are backed by Strapi and that needs such a functionality. Some of them under my control. Some of them from agencies I sometimes work for.

And that's just me. I can't believe there are not a whole lot more people really wished this was part of the core of Strapi. As Strapi stands for flexibility in mind and being an unopinionated open source software, it think users should have the choice to opt-in or opt-out of it.

And of course, that's partly possible by leveraging the plugin you have mentioned. But honestly, I don't think that's the way to go. Since you'll end up as a software like WordPress. Which requires you to have hundreds of plugins to extend the core which kills your website performance. :-)

I'm glad to see you don't say the feature wouldn't be useful. And I am fully aware that this is not something that is really high on your teams priority list atm. All I am trying to say is: please consider it. Perhaps in an internal meeting if you will?

Thanks!

derrickmehaffy commented 7 months ago

Okey, that's fair enough. You guys know more about your existing customers. But please don't take your customer base as the single source of truth.

I am aware of literally hundreds of websites/apps/tools that are backed by Strapi and that needs such a functionality. Some of them under my control. Some of them from agencies I sometimes work for.

And that's just me. I can't believe there are not a whole lot more people really wished this was part of the core of Strapi. As Strapi stands for flexibility in mind and being an unopinionated open source software, it think users should have the choice to opt-in or opt-out of it.

And of course, that's partly possible by leveraging the plugin you have mentioned. But honestly, I don't think that's the way to go. Since you'll end up as a software like WordPress. Which requires you to have hundreds of plugins to extend the core which kills your website performance. :-)

I'm glad to see you don't say the feature wouldn't be useful. And I am fully aware that this is not something that is really high on your teams priority list atm. All I am trying to say is: please consider it. Perhaps in an internal meeting if you will?

Thanks!

I helped the user build the plugin for soft delete so I know there is a use-case for it but we won't be able to cover every use-case. Not saying we won't build this in the future but probably not in the near to medium term future as our internal roadmap for 2024 is pretty set and focused.

The plugin system is exactly for cases like this. A plugin like this vs built in the core won't change performance unless it's poorly built which could be the case even if we built it that's just the nature of web dev.

kevinvugts commented 7 months ago

Okey, that's fair enough. You guys know more about your existing customers. But please don't take your customer base as the single source of truth. I am aware of literally hundreds of websites/apps/tools that are backed by Strapi and that needs such a functionality. Some of them under my control. Some of them from agencies I sometimes work for. And that's just me. I can't believe there are not a whole lot more people really wished this was part of the core of Strapi. As Strapi stands for flexibility in mind and being an unopinionated open source software, it think users should have the choice to opt-in or opt-out of it. And of course, that's partly possible by leveraging the plugin you have mentioned. But honestly, I don't think that's the way to go. Since you'll end up as a software like WordPress. Which requires you to have hundreds of plugins to extend the core which kills your website performance. :-) I'm glad to see you don't say the feature wouldn't be useful. And I am fully aware that this is not something that is really high on your teams priority list atm. All I am trying to say is: please consider it. Perhaps in an internal meeting if you will? Thanks!

I helped the user build the plugin for soft delete so I know there is a use-case for it but we won't be able to cover every use-case. Not saying we won't build this in the future but probably not in the near to medium term future as our internal roadmap for 2024 is pretty set and focused.

The plugin system is exactly for cases like this. A plugin like this vs built in the core won't change performance unless it's poorly built which could be the case even if we built it that's just the nature of web dev.

Understood! Perhaps, if you have time, you could elaborate more on this open issue? It seems that currently the soft deletion is only applied to collection types from outside plugins. Also population still populates the data even though they are soft-deleted.

https://github.com/ChristopheCVB/strapi-plugin-soft-delete/issues/5#issuecomment-1847390555