cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

schemachange: Support DETACHED syntax to immediately return schema change job ID #104597

Open Xiang-Gu opened 1 year ago

Xiang-Gu commented 1 year ago

We should support an option for schema changes, similar to the DETACHED option in changefeed/backup/restore, that returns the job ID immediately without waiting for the job to complete.

This adds value for customers who want to programmatically track their schema change jobs without having to query the jobs table again, as in, for example, https://github.com/cockroachdb/cockroach/issues/104387#issuecomment-1579385917.

Jira issue: CRDB-28621 Epic CRDB-14892

Xiang-Gu commented 1 year ago

https://github.com/cockroachdb/cockroach/issues/79497 is a previous issue that covers only CREATE INDEX, but I think we should provide such an option for all schema changes.

lyang24 commented 1 year ago

Thank you for creating the issue. In the parser, detached is defined as: execute the backup/restore job asynchronously, without waiting for its completion. Given that schema changer jobs are already asynchronous, I wonder if continuing with the DETACHED keyword still makes sense.

By the way, many DDL statements involve the schema changer. I wonder if we can reopen the issue for CREATE INDEX with DETACHED and link it as a child issue of this one. I would like to work on the change that makes CREATE INDEX return a job ID :)

lyang24 commented 1 year ago

DETACHED by itself may not be good enough, because the customer's program might fail to persist the returned job ID.

I propose giving customers more freedom: let them create a job with a specific job ID, in addition to the DETACHED option that returns the job ID. For example:

// JobManagementOptions gives users control over background jobs.
type JobManagementOptions struct {
    // If JobId is provided by the user, crdb uses it to initialize the background job.
    JobId *DInt
    // If Detached is true, crdb returns the job ID as soon as the job is registered;
    // customers can use it to track progress.
    Detached bool
}

// CreateIndex represents a CREATE INDEX statement.
type CreateIndex struct {
    Name        Name
    Table       TableName
    Unique      bool
    Inverted    bool
    IfNotExists bool
    Columns     IndexElemList
    Sharded     *ShardedIndexDef
    // Extra columns to be stored together with the indexed ones as an optimization
    // for improved reading performance.
    Storing              NameList
    PartitionByIndex     *PartitionByIndex
    StorageParams        StorageParams
    Predicate            Expr
    Concurrently         bool
    Invisibility         float64
    JobManagementOptions JobManagementOptions
}
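
To make the intended semantics concrete, here is a minimal, self-contained sketch of how the two options could interact. It is not CockroachDB code: `registry`, `start`, `wait`, and `errJobExists` are hypothetical names, an in-memory map stands in for the jobs table, and a plain `*int64` replaces `*DInt` so the snippet compiles on its own. The assumption illustrated is that a caller-chosen job ID makes retries safe: if the client crashes before persisting the returned ID, it can resubmit with the same ID and get a "job already exists" error instead of starting a second job.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// JobManagementOptions mirrors the proposed struct, with *int64 standing in
// for *DInt (an assumption made so this sketch is self-contained).
type JobManagementOptions struct {
	JobId    *int64
	Detached bool
}

// errJobExists is a hypothetical error for a caller-supplied ID that is taken.
var errJobExists = errors.New("job ID already in use")

// registry is a hypothetical in-memory stand-in for the jobs table.
type registry struct {
	mu     sync.Mutex
	nextID int64
	done   map[int64]chan struct{} // closed when the job finishes
}

func newRegistry() *registry {
	return &registry{nextID: 1, done: make(map[int64]chan struct{})}
}

// start registers a job and runs work in the background. With Detached set it
// returns the job ID right after registration; otherwise it blocks until the
// work completes.
func (r *registry) start(opts JobManagementOptions, work func()) (int64, error) {
	r.mu.Lock()
	var id int64
	if opts.JobId != nil {
		id = *opts.JobId
		if _, taken := r.done[id]; taken {
			r.mu.Unlock()
			return id, errJobExists
		}
	} else {
		// Pick the next ID that is not already used by a caller-supplied one.
		for {
			if _, taken := r.done[r.nextID]; !taken {
				break
			}
			r.nextID++
		}
		id = r.nextID
		r.nextID++
	}
	ch := make(chan struct{})
	r.done[id] = ch
	r.mu.Unlock()

	go func() {
		defer close(ch)
		work()
	}()

	if !opts.Detached {
		<-ch // non-detached callers wait for completion
	}
	return id, nil
}

// wait blocks until the job with the given ID has finished.
func (r *registry) wait(id int64) error {
	r.mu.Lock()
	ch, ok := r.done[id]
	r.mu.Unlock()
	if !ok {
		return fmt.Errorf("job %d not found", id)
	}
	<-ch
	return nil
}

func main() {
	r := newRegistry()
	id := int64(42)
	release := make(chan struct{})

	// Detached: returns the ID while the work is still blocked on release.
	got, err := r.start(JobManagementOptions{JobId: &id, Detached: true}, func() { <-release })
	fmt.Println(got, err)

	// A retry with the same ID is rejected instead of creating a duplicate job.
	_, err = r.start(JobManagementOptions{JobId: &id, Detached: true}, func() {})
	fmt.Println(errors.Is(err, errJobExists))

	close(release)
	fmt.Println(r.wait(got))
}
```

In this sketch the duplicate-ID check is what gives the client an idempotent retry path; whether the real implementation would reject, or instead return the existing job, is an open design question.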