Open miretskiy opened 1 year ago
cc @cockroachdb/cdc
I'm not sure what the issue is, but I can help solve it.
When you start a CDC export (e.g. CREATE CHANGEFEED INTO s3... WITH initial_scan=only), the only way to know when it completes is to query the job status. But you may not want consumers to depend on that (maybe they don't even know the job ID). You may want a consumer to be able to determine whether the export finished by looking at the s3 bucket/directory, and right now that's very hard to tell. So the idea would be to emit a marker file, "export.done" or some such, to indicate that the export completed, so that the consumer can simply watch the directory until the file shows up.
One way to accomplish this is to allow the "resolved" option to be used when the initial_scan=only option is specified.
According to the documentation there are multiple sinks, and s3 is one of them. How should we do this "done" indicator with each sink?
Every sink supports the same interface. For example, EmitResolvedMessage emits a resolved message into any sink: for file-based sinks it writes out a file, and for message-based sinks (kafka, etc.) it sends a message. The "done" indicator would work similarly.
It could be useful if CDC export emitted a "done" indicator. This may be as simple as allowing the "resolved" option to be used w/ export so that a final "resolved" marker file/message is emitted before the changefeed terminates.
Jira issue: CRDB-31703