vaticle / typedb

TypeDB: the polymorphic database powered by types
https://typedb.com
Mozilla Public License 2.0

Redesign schema modification capabilities #6981

Closed krishnangovindraj closed 4 months ago

krishnangovindraj commented 5 months ago

Usage and product changes

We redesign schema modification to allow much more flexible in-place changes to the database schema. Within a schema write transaction, we relax several schema invariants so that schema types can be moved and edited on the fly. The data is still validated against the schema at each step, so TypeDB's existing Concept and Query APIs remain fully and safely usable. Before committing, the schema invariants must be restored, guided by TypeDB's exceptions API (ConceptManager.getSchemaExceptions()).

Expected schema migration workflow

This change facilitates large-scale database schema migration. We expect the following workflow to be adopted:

  1. Open a schema session, and a write transaction. This blocks writes anywhere on the system.
  2. Mutate the schema incrementally. Mutations that expand the schema are always possible and cheap; mutations that restrict the schema are validated against the existing data for conformance to the new schema. All schema states you move through must match the current state of the data.
     a. If your data does not fit the new schema state, in 2.x you will get an exception on commit and the transaction will roll back. You must open a data session and transaction, mutate the data into the expected shape, and commit. Then go back into a schema session and transaction and retry the schema mutation.
     b. In TypeDB 3.0 these operations will all be possible within one schema write transaction, smoothing out the schema migration workflow.
  3. To make schema migration simpler, some schema invariants are relaxed within a schema write transaction:
     a. Dangling overrides are allowed: overridden types (... as TYPE) are allowed to refer to types that are not overridable at that place in the schema. This is common when moving a type from one supertype to a different supertype.
     b. Redeclarations are allowed: declarations of owns, plays, or annotations may be duplicated in child types. This facilitates moving types from one supertype to a different supertype, or moving declarations up or down the type hierarchy.
     c. Relaxed abstract ownership: types may own abstract attribute types without themselves being abstract.
  4. All of these invariants must be restored before commit, or the transaction will fail and the changes will be rolled back. To retrieve the set of errors that must be fixed before commit, use the API ConceptManager.getSchemaExceptions() (transaction.concepts().getSchemaExceptions() in most drivers).
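The relaxed-invariant part of this workflow can be sketched with a toy in-memory model. This is only an illustration in Python, not the TypeDB driver API: the `ToySchema` class and all of its methods are hypothetical, and it models just one relaxed invariant (redeclaring an inherited `owns`). The move to a new supertype is accepted immediately, the duplicated declaration is reported by a deferred check, and commit succeeds only once the invariant is restored.

```python
class ToySchema:
    """Toy in-memory schema. Hypothetical model, NOT the TypeDB API."""

    def __init__(self):
        self.supertype = {}  # type label -> supertype label (or None)
        self.owns = {}       # type label -> attribute labels declared on that type

    def define(self, label, supertype=None, owns=()):
        self.supertype[label] = supertype
        self.owns[label] = set(owns)

    def set_supertype(self, label, supertype):
        # Accepted even if it creates a redeclaration; that is checked later.
        self.supertype[label] = supertype

    def unset_owns(self, label, attribute):
        self.owns[label].discard(attribute)

    def inherited_owns(self, label):
        # Collect every 'owns' declared on any ancestor of the type.
        result = set()
        parent = self.supertype[label]
        while parent is not None:
            result |= self.owns[parent]
            parent = self.supertype[parent]
        return result

    def get_schema_exceptions(self):
        # Deferred validation: list violations instead of raising on the first.
        errors = []
        for label in sorted(self.owns):
            for attribute in sorted(self.owns[label] & self.inherited_owns(label)):
                errors.append(f"{label} redeclares inherited 'owns {attribute}'")
        return errors

    def commit(self):
        errors = self.get_schema_exceptions()
        if errors:
            raise ValueError("; ".join(errors))
        return True


schema = ToySchema()
schema.define("person", owns={"name"})
schema.define("company")
schema.define("employee", supertype="company", owns={"name"})

# Move employee under person: 'owns name' is now duplicated, but the move is allowed.
schema.set_supertype("employee", "person")
pending = schema.get_schema_exceptions()

# Restore the invariant by dropping the redundant declaration, then commit.
schema.unset_owns("employee", "name")
committed = schema.commit()
```

In a real migration the list returned by the exceptions API would play the role of `pending` here: a to-do list of invariants to restore before the commit is attempted.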

Operations that expand the schema capabilities:

Operations that restrict the schema capabilities:

Implementation

Notes

Only a valid schema can be committed. Validation considers the following properties, each of which is checked either 1) immediately, at operation time, or 2) deferred (on request via transaction.concepts().getSchemaExceptions(), or on commit):

Immediate validation:

Deferred validation:

These have been delayed to commit time to allow certain schema modifications to be performed with data in place.

Data integrity invariants

There are never instances of types, ownerships or relations which are not allowed in the schema. Violations are:

This is validated when:
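As a rough illustration of the data integrity invariant, the sketch below uses a toy in-memory store in Python. It is not TypeDB's implementation: `ToyDatabase`, `DataConformanceError`, and every method name are hypothetical. It assumes that a restricting operation (removing an ownership) is rejected while instances still hold that attribute, and that it succeeds once the data has been changed to fit.

```python
class DataConformanceError(Exception):
    """Raised when data would not conform to the schema (hypothetical name)."""


class ToyDatabase:
    """Toy store holding a schema and its data. NOT the TypeDB API."""

    def __init__(self):
        self.owns = {}       # type label -> attribute labels the type may own
        self.instances = []  # list of (type label, {attribute label: value})

    def define_owns(self, type_label, attribute):
        # Expanding the schema can never conflict with existing data.
        self.owns.setdefault(type_label, set()).add(attribute)

    def insert(self, type_label, **attributes):
        # Data writes are checked against the schema as well.
        disallowed = set(attributes) - self.owns.get(type_label, set())
        if disallowed:
            raise DataConformanceError(
                f"{type_label} may not own {sorted(disallowed)}")
        self.instances.append((type_label, attributes))

    def undefine_owns(self, type_label, attribute):
        # Restricting the schema: validate against existing data first.
        count = sum(1 for label, attrs in self.instances
                    if label == type_label and attribute in attrs)
        if count:
            raise DataConformanceError(
                f"{count} instance(s) of {type_label} still own {attribute}")
        self.owns[type_label].discard(attribute)

    def delete_attribute(self, type_label, attribute):
        # Data mutation that brings instances in line with the target schema.
        for label, attrs in self.instances:
            if label == type_label:
                attrs.pop(attribute, None)


db = ToyDatabase()
db.define_owns("person", "name")
db.define_owns("person", "nickname")
db.insert("person", name="Ada", nickname="AL")

try:
    db.undefine_owns("person", "nickname")  # data still uses the ownership
    rejected = False
except DataConformanceError:
    rejected = True

db.delete_attribute("person", "nickname")   # reshape the data first
db.undefine_owns("person", "nickname")      # now the restriction is accepted
```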
