grigoryk opened this issue 6 years ago
Before I dig into this, one minor clarification: :db.type/ref doesn't mean enum. Some ref attributes are used as an enum-like thing, but not all, and enums aren't even necessarily named. Mentat doesn't provide a way to enforce that specific kind of enum, and doing so is actually not as simple as it looks, because you would need to also restrict the space of operations that can be performed on :db/ident. But I digress.
Let me restate, see if I understand what you're asking.
You're saying that:

- Some schema attributes (e.g., :db/cardinality) are db.type/ref, and some of those attributes require their ref values to be entities with :db/ident ("enums").
- Some of those attributes further require the set of :db/ident values to be a restricted set (e.g., {:db.cardinality/one, :db.cardinality/many}).
- That restricted set can change over time (e.g., to admit {:db.cardinality/list}).

You are correct, but it kinda doesn't matter.
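For concreteness, here is a sketch of the pattern being restated, in Datomic-style transaction EDN (the use of :person/name at the end is purely illustrative):

```edn
;; :db/cardinality is itself a ref-typed attribute:
{:db/ident     :db/cardinality
 :db/valueType :db.type/ref}

;; Its "enum cases" are just ordinary entities that happen to have idents:
[:db/add "one"  :db/ident :db.cardinality/one]
[:db/add "many" :db/ident :db.cardinality/many]

;; Nothing in the type system restricts which ref may be asserted here;
;; limiting values to the two idents above is a convention, not a schema rule.
[:db/add :person/name :db/cardinality :db.cardinality/one]
```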
There will always be cases in which the meaning of a schema changes over time; we (and Datomic) have picked/inherited a number of common axes that we directly support (cardinality, doc, uniqueness, etc.), and there are others that we don't ("metaness", permanence/transience, etc.).
Even within the set of properties we model in the core schema there are domain-level concepts that can change in a way we don't formally describe.
There are three ways to represent that in the vocabulary system:

- If the change is backwards-compatible (e.g., :monkey/species, where the introduction of :species/callicebus_miltoni in 2015 is A-OK), we can simply begin using the new 'enum case', and older clients will probably behave correctly.
- If the new case refines an existing one (e.g., :height/very-tall is a subset of :height/tall), then we can add a second attribute and keep writing the first. (This might well be a data modeling error: we should have recorded :person/height instead!)
- The version number of the vocabulary exists precisely to model the remaining kind of exclusion, where the vocabulary cannot simply be implicitly extended, but needs to be replaced.
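In transaction terms, the first two options look something like this (a hedged sketch; the :person/height-class attributes are hypothetical names invented for illustration):

```edn
;; Option 1: backwards-compatible — begin using the new enum case directly.
[:db/add "miltoni" :db/ident :species/callicebus_miltoni]
[:db/add "monkey"  :monkey/species :species/callicebus_miltoni]

;; Option 2: the new case (:height/very-tall, a subset of :height/tall)
;; is recorded via a second attribute; the first keeps being written.
[:db/add "p" :person/height-class      :height/tall]
[:db/add "p" :person/height-class-fine :height/very-tall]
```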
I don't think it's all that feasible to model every possible restriction in the schema language itself: after all, there are constraints you simply can't express statically ("the range of :person/residence is the domain of entities that have a :country/country_code … oh, but people can live in places that don't have country calling codes").

I used the vocabulary version number to allow developers to indicate that one of these constraints has changed.
(This kind of sophistication is one reason why even complicated SQL databases support triggers and stored procedures to impose computed constraints!)
To return to your question:
The reason you're asking is that merging two databases which have the same core vocabulary version but allow different enum cases for a schema-related attribute (cardinality being one such) will break Mentat.
The weaker version of that scenario is that merging two databases which have the same non-core vocabulary version but use different enum cases for its non-schema attribute might break application code.
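Concretely, the breaking scenario looks something like this (a sketch; :thing/tags is an illustrative attribute):

```edn
;; Store A, at core vocabulary version N, knows only:
[:db/add "one"  :db/ident :db.cardinality/one]
[:db/add "many" :db/ident :db.cardinality/many]

;; Store B, also at version N, has additionally transacted:
[:db/add "list" :db/ident :db.cardinality/list]
;; …and asserted it against a schema attribute:
[:db/add :thing/tags :db/cardinality :db.cardinality/list]

;; Merging B into A now asks A to interpret a cardinality it doesn't
;; implement, even though both stores "agree" on the vocabulary version.
```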
You have two choices here.
The first, which is the one I was taking, is to say: don't do that. If you have an attribute that has a limited range of acceptable entities, then when you add another such entity (you'll find yourself writing [:db/add "foo" :db/ident :my/ident]) you must do exactly the same thing you would do when you make a non-back-compat change to a vocabulary: bump the vocabulary version.
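The discipline under this first choice looks roughly like this (a sketch; :my.vocab/version is a hypothetical attribute standing in for however an application records its vocabulary version):

```edn
;; Adding a new acceptable entity to a restricted-range attribute…
[:db/add "foo" :db/ident :my/ident]

;; …is treated as a non-backwards-compatible vocabulary change,
;; so the vocabulary version is bumped in the same transaction.
[:db/add :my/vocab :my.vocab/version 2]
```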
The second is to say that this is something we'll model in vocabulary and track in Mentat. Perhaps :db/type :db.type/closed-enum, and a way to write out the cases.
If you go this route you will need to validate in the transactor, record those enums before syncing values, decide whether the enum cases can shrink and/or grow… and you still haven't solved the problem, because if the enum set can't change backward-compatibly (and for cardinality it cannot), then the developer still needs to bump the version, or at least handle the case where Mentat complains that the remote timeline has a different enum set.
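If Mentat went that second route, a closed-enum attribute definition might look roughly like this (entirely hypothetical: neither :db.type/closed-enum nor :db.enum/cases exists in Mentat today):

```edn
{:db/ident       :person/handedness
 :db/valueType   :db.type/closed-enum                    ; hypothetical type
 :db/cardinality :db.cardinality/one
 :db.enum/cases  #{:handedness/left :handedness/right}}  ; hypothetical case set
;; The transactor would have to validate every asserted value
;; against :db.enum/cases, and sync would have to record the set.
```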
Ultimately you are bumping into the question of what to do when fundamental change occurs. My position is that we should support indicating, migrating, and detecting, but that we cannot transform all kinds of fundamental change into automatic change.
By the way: whenever the set of supported types is changed (and there are several such changes on the list), we will need to bump the core vocabulary version. Remember that older clients might not even be able to represent those newer types — they might need a different SQLite schema!
Locking out clients will probably be an infrequent event, but we cannot eliminate it entirely.
If core schema is viewed through a lens of multiple mentats figuring out if they're compatible with each other, it seems insufficient. Currently, if two instances agree on the core schema, their transactions are not necessarily compatible with each other.
Quoting @ncalexan: "[when defining vocabularies] we can say a thing needs “to be” an enum (:db.type/ref) but we can’t say “and these are the valid enum cases: :db.cardinality/*”. (And the transactor doesn’t handle that value restriction at all.)"

@rnewman you've added the core schema initially; do you think it's reasonable to expand it going forward? Do you think having a subset of the bootstrap transaction defined as a special thing is valuable?