Nozbe / WatermelonDB

🍉 Reactive & asynchronous database for powerful React and React Native apps ⚡️
https://watermelondb.dev
MIT License
10.6k stars 597 forks source link

RFC: New delete API #1016

Open radex opened 3 years ago

radex commented 3 years ago

RFC: New delete API

Goals:

Basic API

await model.remove()

This, by default, marks as deleted the model and all of its descendants.

remove() can only be called inside a writer action.

remove is used to avoid using delete, because it's a JS keyword. (I'm open to feedback, we can pick another word).

UPDATE: Unless I'm missing something, delete is only a keyword contextually so we actually can name the method delete()

Marking as deleted / destroying permanently:

Overwrite the default by explicitly stating the desired behavior:

await model.remove({ permanently: true }) // destroys permanently (actually removes from database)
await model.remove({ permanently: false }) // marks as deleted (so that deletion can be synced later)

Non-syncable apps

If the app does not use sync (is local-only), the Database instance can be configured to remove permanently by default:

const db = new Database({
  ...,
  // Not sure how to call this parameter, two ideas:
  removePermanentlyByDefault: true, // explicit name
  syncable: false, // more generic, could also control other behavior differences (can be a pro or con, not sure)
})

In that case, remove() destroys permanently by default.

I t would probably be possible to override it with { permanently: false }, but I'm not sure there are use cases for that -- maybe it's better to treat { syncable: false } as an optimization that skips keeping track of sync state altogether - WDYT?

I'm not entirely sure if this is the right design. I'd rather Model methods not have knowledge about Database configuration, and behave differently based on it. It feels wrong to me. And any library code to work with WatermelonDB would have to explicitly declare whether a remove is meant to be permanent or not.

On the other hand, unless you're syncing, using raw adapter methods, or unsafe raw SQL queries, the difference in behavior should be invisible - in both cases a removed record becomes invisible and unusable.

Also, for non-syncing apps, the need to specify { permanently: true } on every call would be very inconvenient and a regression (instead of an improvement) from just calling destroyPermanently() every time.

WDYT?

Descendants

By default, remove() removes all the record's descendants (children and their children, etc).

You can override this behavior and (possibly unsafely, depending on the app's semantics) force it to only remove the individual record:

// NOTE: I'm open to suggestions about this API, `descendants` is a confusing word for non-fluent English speakers
await model.remove({ descendants: false })

By default, descendants are determined using Model.associations. All tables that a model has_many of is considered to be descendant. For example:

export default class Comment extends Model {
  static associations = associations(
    ['tasks', { type: 'belongs_to', key: 'task_id' }], // parent
    ['attachments', { type: 'has_many', foreignKey: 'parent_id' }], // child (descendant)
  )
}

Removing a comment will also delete all attachments where attachment.parent_id == comment.id.

This behavior can be overriden by implementing static childrenToRemove:

export default class Comment extends Model {
  static childrenToRemove(ids) {
    // ids is an array of RecordIds of comments that are about to be deleted
    // return a list of queries that find all descendants that need to be deleted, like so:
    return [
      {
        table: 'attachments',
        query: Q.and(
          Q.where('parent_id', Q.oneOf(ids)),
          Q.where('parent_type', 'comment')
        ),
      },
    ]
  }
}

Note that this method isn't called on every comment to delete, but instead, it's called once on the Comment table to determine all children of all comments that are about to be deleted. If the returned record queries also have descendants, Watermelon will recursively call all their childrenToRemove(). This way, fast tree deletion can be achieved with multiple descendant levels, and it could also be extended to fast deleting a result of a query, not an individual record.

When childrenToRemove() is specified, associations are not consulted, so to override, you have to reimplement getting all children.

I'm also considering changing the definition so that instead of a simple static definition of children, an asynchronous function that returns actual records (not just a query) is used. This would be more powerful and might be necessary for some more complex app schemas, but it might also prevent some advanced optimizations. WDYT?

TODO: What about models that form cyclic graphs (a record can both have many and belong to records of the same type)? It would be easy to create an infinite loop without some special precautions. Do we need to worry about it? It's a relatively uncommon use case and Watermelon doesn't have great built-in support for this anyway, so maybe apps that need it can just worry about it on their own?

Query deletion

This removes all records matching the query:

await query.remove()

Again, { permanently: true/false } and { descendants: true/false } can be used to override the deletion behavior, and existing query.markAllAsDeleted() and query.destroyAllPermanently() are deprecated and will be removed.

"On delete" action

Sometimes, deleting a record triggers some additional action in addition to deleting its descendants.

In Nozbe Teams, there's one such case: Deleting a ProjectSection (a thing that organizes a Project into smaller chunks) should not delete the Tasks that belong to it - instead, they should simply be changed not to belong to any section.

We could create an API that goes something like this:

export default class ProjectSection extends Model {
  static onRemove(database, ids) {
    return database.write(async writer => {
      const tasks = database.get('tasks').query(Q.where('project_section_id', Q.oneOf(ids)))
      await writer.batch(
        tasks.map(task => task.prepareUpdate(() => {
          task.projectSection.set(null)
        }))
      )
    })
  }
}

However, there's a complication: if we want to remove the entire Project, we want to remove all its descendants, including all its tasks, and sections. We don't want sections' tasks to be moved to "without section", because they also get deleted. So how do we prevent this onRemove action to be called without breaking other use cases? When do we wait for onRemove to be executed? How do we ensure that onRemove doesn't change or remove records that we've just deleted/are about to delete? How would that work with Batching?

This makes me doubt that we want to have an "on delete" API anyway. I think in this case, the simplest thing to do is to create a custom removeSection method with this special behavior we can call on an individual section, and prevent users from accidentally calling section.remove(). This would preserve desired behavior for deleting the entire Project.

WDYT? I'm curious to hear other apps' use cases, maybe we do need some generic mechanism for this after all.

Batching

Currently, we can call record.prepareMarkAsDeleted() / record.prepareDestroyPermanently() to be able to make that change transactionally in a single batch with other changes. A requirement of prepareX methods is that they must be called synchronously with batch()… while remove() must be asynchronous.

This complicates the API, since we must split the remove operation into an asynchronous preparation of the list of records to remove, and a synchronous marking of those records.

Here's a few ideas:

// 1:
const preparedRemove = await record.prepareRemove({ ... })
await database.batch(...preparedRemove.prepare())

// 2:
const prepareRemove = await record.remove({ prepare: true })
await database.batch(...prepareRemove())

// 3 - preparedRemove here is a magic object that database.batch() understands and _it_ changes these Models
const preparedRemove = await record.prepareRemove({ ... })
await database.batch(preparedRemove)

WDYT?

Do we even need this? Does anyone care about running this remove in a single batch with other changes?

Deprecations & removals

The following APIs would be deprecated and removed in a later version:

Model.prepareMarkAsDeleted
Model.prepareDestroyPermanently
Model.markAsDeleted
Model.destroyPermanently
Model.experimentalMarkAsDeleted
Model.experimentalDestroyPermanently
Query.markAllAsDeleted
Query.destroyAllPermanently
radex commented 3 years ago

@kokusGr @michalpopek @rozPierog & all outside contributors -- feedback would be appreciated! ;)

kokusGr commented 3 years ago

I'm not entirely sure if this is the right design. I'd rather Model methods not have knowledge about Database configuration, and behave differently based on it. It feels wrong to me. And any library code to work with WatermelonDB would have to explicitly declare whether a remove is meant to be permanent or not.

Other solution could be to add a static property static removePermanentlyByDefault = true when defining a new model. This way model behaviour doesn't depend on db configuration and user even can specify only some models as removed permanently by default. The only drawback is that you have to remember about it when declaring a new model, but that doesn't happen that often.

OTOH I don't think that depending on DB config is that wrong.

descendants is a confusing word for non-fluent English speakers

We could use children even though it's not technically correct, but I think users will be fine with descendants :)

I'm also considering changing the definition so that instead of a simple static definition of children, an asynchronous function that returns actual records (not just a query) is used. This would be more powerful and might be necessary for some more complex app schemas, but it might also prevent some advanced optimizations. WDYT?

Could you think of an example when this would be useful? In most cases removing descendants is straightforward. However this also means that you would rarely need to use childrenToRemove method so it might as well be flexible.

"On delete" action

I don't see a reason to add such "hooks". I think adding a removeSection method that implements a custom behavior is enough and could even be superior if for example you need 2 different methods like removeSection and removeSectionWithTasks.

radex commented 3 years ago

@sidferreira @fahrinh @diegolmello @kilbot @henrymoulton

radex commented 3 years ago

Other solution could be to add a static property static removePermanentlyByDefault = true when defining a new model. This way model behaviour doesn't depend on db configuration

Hmm, that is interesting, though I think I'd rather have some extra dependency between model methods and database than to do this -- because it's error prone and might not immediately be easy to spot the error before realizing you lost some data/are accumulating junk data

and user even can specify only some models as removed permanently by default.

that is interesting though - we don't have that need but just recently there were questions about how to exclude a table from synchronizing: https://github.com/Nozbe/WatermelonDB/issues/1027

diegolmello commented 3 years ago

Do we even need this? Does anyone care about running this remove in a single batch with other changes?

We do it. We have an operation to fetch data changed since the last time the user closed the app. The server returns updated and removed records. With updated we do upsert (the best way we can, but it's still too messy) and add to the batch. With removed query and add to the same batch.

radex commented 3 years ago

@diegolmello but in this instance, would you even want to remove descendants of removed objects in a batch? or do you just care about deleting in one batch the exact records specified by your sync?

diegolmello commented 3 years ago

We don't have any logics for descendants. It's only exact records as you said.

sidferreira commented 3 years ago

(still reading a 2nd time... it seems that the whole problem is the freaking delete keyword for JS)

radex commented 3 years ago

the whole problem

hm, what whole problem?

sidferreira commented 3 years ago

@radex After sleeping on this:

Basically remove({ descendants: true, permanently: false }) is the current behavior.

KrisLau commented 2 years ago

@radex After sleeping on this:

  • I rather use delete
  • I agree with static removePermanentlyByDefault = true, but maybe static removeBehavior = DELETE | MARK_DELETED to be a bit shorter but a bit more readable.
  • I suggest cascade as it is part of the SQL lingo, instead of descendants

Basically remove({ descendants: true, permanently: false }) is the current behavior.

@sidferreira How does the cascade delete work for the current implementation? I tried adding { descendants: true } to prepareMarkAsDeleted and it didn't seem to do anything?

fivecar commented 1 year ago

Goals:

  • Unified, simple, nice, understandable API for deleting records
  • Deprecate (and eventually remove) existing markAsDeleted, destroyPermanently, experimentalMarkAsDeleted, experimentalDestroyPermanently methods (which can be confusing)
  • Finally, high performance deletion of large trees of records (currently not possible)

@radex : Your goals (1) and (3) appeal a lot to the scenarios in which I use WatermelonDB. I don't have an informed opinion on (2), and especially on how (2) may or may not be reflected in Models.

Current use of relations/associations often requires that adopters implement descendent deletion whenever they want an object deleted. It seems building this into the API would satisfy the common scenario (where most users, I'm pretty sure, would want descendants deleted when a parent is deleted). The efficiency is also something that'd be a nice add-on.