cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.95k stars 3.79k forks source link

IMPORT DDL should all go through SQL #67076

Open shermanCRL opened 3 years ago

shermanCRL commented 3 years ago

Is your feature request related to a problem? Please describe.

Our current IMPORT logic handles DDL statements in an incomplete way. Some DDL that CRDB supports in general are not supported at IMPORT time, because IMPORT has its own DDL logic.

Our users experience this as a surprising failure.

Restated, DDL features of IMPORT lag behind CRDB’s general DDL.

Describe the solution you'd like

IMPORT should hand off all DDL logic to SQL.

Epic CRDB-8353

Jira issue: CRDB-8348

ajwerner commented 3 years ago

It's hard to imagine this happening before we move onto the new schema changer infrastructure. Even then, it feels like a pretty big project. Importing descriptors need special handling. Also, import execution interacting with those descriptors needs special handling. I think it'd be very difficult to think about tackling this in the next 6 months.

shermanCRL commented 3 years ago

OK, that’s helpful. I don’t have a strong picture of scope and complexity here.

nihalpednekar commented 3 years ago

Un-assigning from myself in light of Andrew's comment.

dt commented 3 years ago

There's an open question as to whether or not we actually need to do anything special on the schema side. One rough MVP of this is to just hand every non-insert/copy statement over to an internal executor, probably in its own txn but maybe in one big one, and let the current behavior do its thing, then resolve the names of the descs in the insert-into/copy statements to find those new descs, schema change those to offline ala IMPORT INTO, then import.

Obviously this would be a functional change from today, in that descs would be visible to the user mid-job, and if we did nothing special, even public briefly after creation before we moved them to importing, but I think it is still TBD if that is a UX we're okay with (they're actually already visible, offline, mid-job so its more about the brief online state and how we handle rollbacks). We might need a little schema functionality, e.g. a session var that creates new tables in Offline_IMPORTING or something if we don't like the new brief public window, and depending on how we want to handle rollbacks, some what to track which descriptor IDs we created.

With an approach like this, I'm not actually sure that we'd really need all that much new schema functionality and/or need to block on the new schema changer. Is there something specific you were thinking of @ajwerner ?

ajwerner commented 3 years ago

internal executor, probably in its own txn but maybe in one big one,

The internal executor does not support multi-statement transactions that accumulate extraTxnState today; we'd need to upgrade that infrastructure. I worry also about the family of bugs that can happen due to weird interactions between schema change jobs and the import job. How do we handle resumptions? It's doable but it's more complex than I think you're giving it credit for.

depending on how we want to handle rollbacks, some what to track which descriptor IDs we created.

This feels like a big part of the problem.

if we don't like the new brief public window

What makes you say "brief"?


I suppose there's a way to make it work but I'm anxious about it. My instinct is that it feels complex enough for a design doc that would follow on from some initial prototyping.

rafiss commented 2 years ago

The internal executor does not support multi-statement transactions that accumulate extraTxnState today; we'd need to upgrade that infrastructure.

We can revisit this once https://github.com/cockroachdb/cockroach/pull/82477 is done

rafiss commented 2 years ago

cc @ecwall @ZhouXing19 for awareness of the history here

dt commented 2 years ago

more history over on https://github.com/cockroachdb/cockroach/pull/72317