Open stevendanna opened 2 years ago
This relates to https://github.com/cockroachdb/cockroach/issues/26476. More generally, our story around GC and dynamic ttls is largely non-sensical.
I've assigned this to myself for the purpose of figuring out what the broader problem is, here, and bringing together various stakeholders to fix it.
First of all, there's the problem that zone configs are defined hierarchically and inherit from their ancestors, but the hierarchy's lifecycle isn't coordinated; even worse the zone config values themselves also determine this lifecyle. This issue is one instance of that: the db's zone config disappears despite there being zombie tables that depend on it, so things go awry. So far I see three potential kind of fixes, but there might be more:
Then, there's the problem that how tables are GC'ed is the result of an accident of history: the GC job relies on the table descriptor and the zone config instead of its behaviour (TTL etc) being defined solely by its own job details.
Somewhat related in https://github.com/cockroachdb/cockroach/issues/26476 there's a legitimate need to soft-delete tables and whatnot by moving them to some trash schema or db. How do we deal with zone configs if we want to do that? Do we want to do that?
Going even further one may wonder what is the purpose at all of a SQL-level GC job, just DelRange
ing the table data and Del
eting the table metadata at the end of the DROP schema change should be enough, now that DelRange
exists (it's MVCC compatible, right? that was the whole point IIRC).
cc @devadvocado: zone configs are one of those things that SQL Schema owns but which are mostly used outside of the team. Unsurprisingly the expected behaviour is poorly defined, especially as it appears between how the zone configs change over time and how the associated schema objects change over time.
I've unassigned myself considering that I'm very unlikely to close this before going on leave.
This might relate to the proposed concept of a "root descriptor" which is parent to all DBs. cc @ajwerner
Describe the problem
If we create a database and a table:
We shortly after see a span configuration for the new table:
If I modify the
baz
database's zone configuration, I expect that the span configuration for the span starting at /114 will be updated. However, if I create this zone configuration and then quickly drop the table and the database:The zone configuration update is sometimes never reflected in the span configuration. I believe that most proximately this is the result of the fact that in DROP DATABASE, we delete the zone configuration immediately if the database does not have any tables:
https://github.com/cockroachdb/cockroach/blob/70e868e6546df5e0d29b40489a0416c7c67f6b9d/pkg/sql/schema_changer.go#L2695-L2710
This isn't a big usability issue, but it does mean that one of backup/restore's attempt to make sure some cleanups happen quickly don't work correctly:
https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_job.go#L2730-L2737
Jira issue: CRDB-19588