cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.09k stars 3.8k forks source link

sql: support COPY ... with FREEZE #85573

Open rafiss opened 2 years ago

rafiss commented 2 years ago

Support the FREEZE option for COPY.

Requests copying the data with rows already frozen, just as they would be after running the VACUUM FREEZE command. This is intended as a performance option for initial data loading. Rows will be frozen only if the table being loaded has been created or truncated in the current subtransaction, there are no cursors open and there are no older snapshots held by this transaction. It is currently not possible to perform a COPY FREEZE on a partitioned table.

Note that all other sessions will immediately be able to see the data once it has been successfully loaded. This violates the normal rules of MVCC visibility and users specifying should be aware of the potential problems this might cause.

https://www.postgresql.org/docs/current/sql-copy.html

Some of this may not be applicable to CRDB.

Jira issue: CRDB-18317

Epic CRDB-18320

rafiss commented 2 years ago

copying over @cucaroach's notes from https://github.com/cockroachdb/cockroach/issues/86792

PG has an optimized version of COPY that works in the following scenario:

  1. COPY has FREEZE annotation
    1. COPY is under explicit transaction
  2. Target table was created under same transaction and isn't yet visible

I don't know about edge cases (like can this transaction read from or do more writes to the table after the COPY? can the table also have secondary indexes?)

I also don't know what the state of CRDB's transactional DDL is but if we had it we could just catch this scenario and hand over the COPY to the bulkio import path I think. @dt @ajwerner thoughts? The rationale here is we want a way to speed up the initial bulk full table copies DMS services do.

mgartner commented 1 year ago

@cucaroach is this still a priority for 23.1?

cucaroach commented 1 year ago

@cucaroach is this still a priority for 23.1?

No, its too limited in applicability, I discussed it in #91831

mgartner commented 1 year ago

Thanks. Moving this to the backlog then.