cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

cdc: split_column_family + resolved #79452

Open amruss opened 2 years ago

amruss commented 2 years ago

Right now we return an error if you use the split_column_families and resolved options on the same changefeed. This is not ideal; we should allow both options to be used together.

Jira issue: CRDB-14872

Epic CRDB-19180

blathers-crl[bot] commented 2 years ago

cc @cockroachdb/cdc

HonoreDB commented 2 years ago

Technical context: Resolved timestamps are sent to every topic (in Kafka/pubsub). But "every topic" is not well-defined for split_column_families: new topics are added whenever we see a new column family for the first time. So to implement this properly, the change frontier needs to either know about all topics that have been emitted to, by adding some info to the messages it gets from the changefeed aggregators, or tap into the schema feed and construct the topics the same way the changefeed aggregators do. Either way seems perfectly fine to me.

Workarounds in the meantime: Use the individual family feed syntax instead, or use a different sink that gets "global" resolved timestamps instead of fanning them out to every topic.
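To make the first option in the technical context above concrete, here is a minimal sketch in Go. It is not CockroachDB code: `progressUpdate`, `frontier`, `observe`, and `resolvedTargets` are hypothetical names, and the topic names are only illustrative. It assumes each aggregator attaches the topics it has emitted to onto its progress messages, and the frontier keeps the union so it knows where to fan out a resolved timestamp.

```go
package main

import (
	"fmt"
	"sort"
)

// progressUpdate is a hypothetical message from a changefeed aggregator to
// the change frontier. A real message would also carry span and timestamp
// progress; only the extra topic info is modeled here.
type progressUpdate struct {
	Topics []string // topics this aggregator has emitted to
}

// frontier is a toy stand-in for the change frontier. It remembers every
// topic any aggregator has reported.
type frontier struct {
	topics map[string]struct{}
}

func newFrontier() *frontier {
	return &frontier{topics: make(map[string]struct{})}
}

// observe records the topics reported in an update. Topics already known
// are ignored, so repeated reports are harmless.
func (f *frontier) observe(u progressUpdate) {
	for _, t := range u.Topics {
		f.topics[t] = struct{}{}
	}
}

// resolvedTargets returns, in sorted order, every topic that should receive
// a resolved timestamp message.
func (f *frontier) resolvedTargets() []string {
	out := make([]string, 0, len(f.topics))
	for t := range f.topics {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	f := newFrontier()
	// The first aggregator has only seen the primary family so far.
	f.observe(progressUpdate{Topics: []string{"orders.primary"}})
	// A second aggregator later sees a new column family.
	f.observe(progressUpdate{Topics: []string{"orders.extra", "orders.primary"}})
	fmt.Println(f.resolvedTargets())
}
```

The second option (tapping into the schema feed) would replace `observe` with logic that derives the same topic set from schema changes; the sketch does not attempt that.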

mari-crl commented 2 years ago

:information_source: Hello! I am a human and not at all a robot! Look at my very human username! :robot: :notes: :thinking: Although I tried very hard to figure out what to do with this issue, more powerful human brains will need to help me. (specifically: Both GitHub and Jira issues were modified after exalate problems) :confounded: :arrows_counterclockwise: Please visit this issue's mirror at CRDB-14872 and try to sync the two sides up manually. :star2: :white_check_mark: When you're finished, comment saying as much and a member of Developer Infrastructure will be along to finish linking. :link: :no_entry_sign: Note that until this is done, this issue is not and will not be synced to Jira with Exalate. :no_entry_sign: :sweat_smile: Feeling lost? Don't worry about it! A member of @cockroachdb/exalate-22-cleanup-team will be along shortly to help! :+1: :construction_worker: Developer Infrastructure members: when ready, open Exalate from the right-hand menu of the mirror issue in Jira, then choose Connect and enter this issue's URN: cockroachdb/cockroach-79452. Either way, delete this comment when you're done. :key: :pray: Thank you for your compliance, my fellow humans! :robot: :wave:

miretskiy commented 1 year ago

@HonoreDB is any work planned on this? Maybe close this issue if we don't have immediate plans (considering we do have a workaround)?

HonoreDB commented 1 year ago

Up to you, but I'd rather keep issues like this open in a "long-term backlog" column. The workaround might not be enough for all users, and it'd certainly be feasible to solve this; it's just not worth the effort right now.