Open julienperret opened 2 years ago
@Hydrocharged committed to have this feature supported in foreign keys v2 delivered by end of month.
I spoke to the team and it looks like I made an error in judgement.
The SQL engine that Dolt uses is go-mysql-server(GMS), and I'm working on a full foreign key rewrite that will bring cross database support for foreign keys. Dolt, as an integrator of GMS, will syntactically support cross-database foreign keys, however how Dolt manages multiple databases within a server context is at odds with the referential integrity provided by foreign keys.
In simple terms, Dolt stores each database as its own repository, and creating a foreign key across repositories will cause issues with many of our versioning features as many of them do not support multiple databases. Also, due to their separate commit histories, there is no way to enforce referential integrity when changing to a different branch, reverting a commit, etc. So although there will be a "technical" form of support in Dolt, it may be in our best interest to disable cross database support for foreign keys.
If you do not require that each database belong to its own repository, then we may be able to have a special database statement that essentially creates a new namespace within the same commit history, but that's something I'd have to discuss with the team (assuming it is something you would want).
There are a couple ways we could move forward with this requirement, but both of them require more work than Daylon originally thought.
1) Support multiple DBs in the same commit history. This is on our road map here: https://github.com/dolthub/dolt/issues/3043
2) Defer cross-DB foreign key enforcement unless all necessary DBs are present in a local checkout. We would need to think about how this works with remotes, when to do the enforcement during clones / pulls, etc. The problem is that many dolt command line commands (e.g. dolt pull
) currently operate on a single database, and in order to enforce cross-DB foreign key constraints we would need to expand these operations to include many more databases, figure out what to do if a violation were introduced, etc.
3) Create cross-db foreign key constraints, but do not enforce them. They would be informational only.
If you could tell us a little more about your use case, it would help us plan and decide how to tackle this request.
Thanks,
Zach
The use case in an application that allows to edit multiple independant datasets. The datasets being independant, we would rather keep the commit graphs independant.
The need for a foreign key is to have a parent/"common" dabatase (with its own commit graph) is simply to have an overview of all datasets and a shared primary key for datasets. It might not be the best way to do this with dolt though...
We are open to suggestions!
Thanks, Julien
Thanks for getting back to me. I can think of a couple ways forward.
Depending on the size of the common database and how often you expect it to change, a reasonable solution might be to just embed it in each child database, which would mean that the foreign keys are all within the same database. Then ship updates to the children periodically by dropping and re-creating the copy using the data from the parent main
.
If that doesn't work for you, a solution relatively near to hand would be to support cross-db foreign keys as informational only during database operations, and to implement an out-of-band way to check the consistency if all dbs are present. We have similar functionality already with the dolt constraints verify
command, which is used to verify if foreign key constraints are actually valid in the database (since it's possible to disable them during a session).
The other way forward is to put everything in the same database / commit graph so the constraints are self-contained. This again depends on the size of your data and what workflow you are optimizing for that makes multiple DBs attractive to begin with. Some ideas of how that could work:
1) Have a branch or a fork called main
that contains the common table data, and no other data
2) Have a branch or a fork for each child database, based off the parent
3) When the common DB changes, merge those changes into each child
Whether to use forks or branches depends on the size of the and how you intend to structure your application. With branch, every client gets all the data when they clone or pull. With forks, they get only the data on that fork when they clone or pull. Either solution lets people manage multiple branches based off whatever they consider their base branch.
Let me know what else I can tell you. Also feel free to bug me more on discord. Let's figure out how to get you up and running.
The following does not work:
Using dolt sql:
It produces the error:
But still creates a table without the foreign key:
Produces:
Am I missing something?
Tried with dolt version 0.36.0 and 0.37.6.
PS: Yes, my example is very much inspired by #1104