agentm / project-m36

Project: M36 Relational Algebra Engine
The Unlicense
892 stars 48 forks source link

implement deferred constraints #310

Open agentm opened 2 years ago

agentm commented 2 years ago

Discussed in https://github.com/agentm/project-m36/discussions/307

Originally posted by **farzadbekran** October 19, 2021 I was thinking it might be a good idea to have some way to delay constraint checks until `commit`. Here is an example scenario: Imagine I have RelVar `a` and `b`, now RelVar `b` has an attribute that is foreign key from `a`. Now if I want to make a change to the attributes of `a`, I need to `undefine` `a`, make changes to the attribs and update the tuples accordingly and redefine it. But since `b` depends on `a` existing, this fails as I can't simply `undefine` `a`, which means I have to `undefine` constaints for `b` to be able to change `a`, and if we have a RelVar `c` which depends on `b`, we have a nightmare on our hands! I assume the is no easy way to determine what constraints reference `a`, so that it can be somewhat automated or something. But if I can disable constraint checks temporarily, I can make my changes to `a` without having to `undefine` constraints for `b` and `c` in the process. Hope this makes sense! This would come in handy specially if I need to make a quick change in production.
agentm commented 2 years ago

C.J. Date in "Database In Depth" around page 126 discusses his multiple assignment operator (the comma) and how is differs from deferred execution, allowing various whole statement optimizations to still apply while not requiring constraints deferred all the way to commit time. It's worth considering.

YuMingLiao commented 2 years ago

I assume the is no easy way to determine what constraints reference a, so that it can be somewhat automated or something.

A constraint is a RelationalExpr data type. Maybe things like uniplate could find out what relvar it depends on.

agentm commented 2 years ago

Indeed, the current constraint optimizer uses such information to determine which constraint checks can be skipped (if the update expression mentions relvar x and the constraint does not, then the constraint can be skipped), so that part of the implementation is covered.

boogerlad commented 2 years ago

Is this issue about implementing deferred constraints or multiple assignment operator?

agentm commented 2 years ago

Ultimately, deferred constraints and the comma operator serve the same purpose. The advantage of the comma operator is that it doesn't actually defer constraint-checking. Date explains:

s:= ..., p:=...;
* First, all of the expressions on the right side are evaluated.
* Second, all of the constituent assignments are then executed in sequence as written [thus triggering the constraint checker].

...since multiple assignment is considered to be a single operation, no integrity checking is performed "in the middle of" any such assignment.

Unlike deferred constraints (often checked at commit-time), the comma operator allows the user to have more fine-grained control over specifically where constraint-checking happens. The comma operator could be implemented in Project:M36 without any special state (such as holding onto which constraints should be validated at commit-time).

boogerlad commented 2 years ago

Thanks! I did read page 126 but wasn't sure which direction Project:M36 would take. This would also allow for true 1:1 relationships between tables instead of 1:[0,1] at best without deferred constraints.

agentm commented 2 years ago

Indeed, the solution also appeals to me because it's a feature which doesn't exist in any other database and seems to be a slick solution to a common problem. SQL just doesn't have relational assignment or equality as concepts, but these are fundamental to the algebra.