Open postamar opened 3 years ago
Indeed, when a table descriptor goes through the schema changer machinery, we add its ID to all type descriptors whose IDs are in the return value of GetAllReferencedTypeIDs(), which includes the type descriptors' transitive closure.
I was in the backup/restore area debugging something entirely different when I chanced upon an explanation for why this may have been designed the way it was. Consider the case when we're backing up a table that contains a user defined type. If this table is restored into a database which doesn't contain this type we also restore the type descriptor. Note that a type descriptor can't exist without a corresponding array type alias. As such, it looks like we elected to capture the array type descriptor as a "dependency". This if block https://github.com/cockroachdb/cockroach/blob/1684499d8979caea5aeac537f9c8da6d5f225307/pkg/ccl/backupccl/restore_planning.go#L732. is where all this happens.
FWIW, I don't think we necessarily need to rely on backing up the array type descriptor for this though, considering that the array type descriptor can be entirely derived from the actual enum type descriptor.
Consider the case when we're backing up a table that contains a user defined type. If this table is restored into a database which doesn't contain this type we also restore the type descriptor. Note that a type descriptor can't exist without a corresponding array type alias. As such, it looks like we elected to capture the array type descriptor as a "dependency".
Thanks for the explanation! This design choice makes a lot more sense to me now.
This just got pushed onto the critical path, if we're to implement DROP TYPE ... CASCADE
correctly. Also bad things like https://github.com/cockroachdb/cockroach/pull/84305 keep happening because of this lack of validation, so we might as well do it sooner rather than later.
I've unassigned myself considering that I'm very unlikely to close this before going on leave.
There are a few issues related to
ReferencingDescriptorIDs
in type descriptors, all related to how these descriptor references behave differently than others, like foreign key references, etc.Although the name would suggest otherwise,
ReferencingDescriptorIDs
only contains table descriptor IDs. Either rename the field to reflect this, or also include database or type descriptor IDs as well.Unlike all other descriptor references,
ReferencingDescriptorIDs
contains not just direct references but also transitive references. Indeed, when a table descriptor goes through the schema changer machinery, we add its ID to all type descriptors whose IDs are in the return value ofGetAllReferencedTypeIDs()
, which includes the type descriptors' transitive closure. This is useful to quickly check whether a type can be dropped safely or not, but it is unlikely to scale well and has all kinds of ugly implications (what happens when we have record types as well as enum types? etc.). I strongly feel we should stick to direct references, along with providing some useful machinery in the descriptor collection or wherever incatalog
to walk the relational graph to collect all transitive dependencies.Unlike all other descriptor references, there are no back-reference checks done on
ReferencingDescriptorIDs
when validating a table descriptor, for instance. Adding these is not so straightforward, since in some cases these back-references are not added in the same transaction but asynchronously in a schema changer job. Specifically, I tried adding back-reference checks and the CCL logic test suitealter_table_locality
failed with themultiregion-9node-3region-3azs
config due to a validation failure following a ROLLBACK.To address these points I suggest we apply a similar treatment to this field as we did to the old foreign key representation:
referencing_descriptor_ids
field in the type descriptor proto;referencing_database_descriptor_ids
,referencing_table_descriptor_ids
,referencing_type_descriptor_ids
) containing only direct references;RunPostDeserializationChanges
type descriptor builder method implementation to do the same thing;referencing_descriptor_ids
;Jira issue: CRDB-6452
Epic CRDB-31469