yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

CDC without using Kafka #2513

Open joeblew99 opened 5 years ago

joeblew99 commented 5 years ago

About CDC: it would be awesome if the architecture (and, in the future, the docs) allowed a more neutral way to tap into the Avro schema types and to integrate with different message queues.

CockroachDB also pushes to Kafka, so it's not great in this respect either.

Sure, there is no standard for open service brokers (there is, but it's laborious), so I would have thought it's possible to provide a neutral CDC feed system that any developer can tap into to write the data into whatever message-queue-like system they use.

For me it would be NATS Streaming Server. https://github.com/nats-io/nats-streaming-server

But Liftbridge would also be fine. I note that Liftbridge is purely gRPC-based and so more neutral in that respect. https://github.com/liftbridge-io/liftbridge

Anyhow, I think you understand the intent of this issue. I would be happy to work with the team on this, as I like YugabyteDB. I am also running CRDB but don't like its complexity.
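To make the "neutral feed" idea concrete, here is a minimal sketch of what a broker-agnostic sink layer could look like. Everything in it is an assumption for illustration: `ChangeEvent`, `Sink`, `InMemorySink`, `forward`, and the `cdc.<table>` subject naming are hypothetical and are not YugabyteDB's CDC format or API. A real sink would wrap a Kafka, NATS, or Liftbridge client behind the same `publish` method.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; NOT YugabyteDB's actual CDC format."""
    table: str
    op: str        # "insert", "update", or "delete"
    key: str
    payload: dict


class Sink(Protocol):
    """Anything that can accept a change event (Kafka, NATS, Liftbridge, ...)."""

    def publish(self, subject: str, event: ChangeEvent) -> None: ...


class InMemorySink:
    """Stand-in for a real broker client so the sketch runs without one."""

    def __init__(self) -> None:
        self.messages: list[tuple[str, ChangeEvent]] = []

    def publish(self, subject: str, event: ChangeEvent) -> None:
        self.messages.append((subject, event))


def forward(events: Iterable[ChangeEvent], sink: Sink, prefix: str = "cdc") -> int:
    """Push each event to the sink under a per-table subject; return the count."""
    count = 0
    for event in events:
        sink.publish(f"{prefix}.{event.table}", event)
        count += 1
    return count


# Usage: the feed code never needs to know which broker sits behind the sink.
sink = InMemorySink()
forwarded = forward(
    [
        ChangeEvent("users", "insert", "1", {"name": "a"}),
        ChangeEvent("orders", "delete", "7", {}),
    ],
    sink,
)
```

The point of the design is that only the `Sink` implementation changes per message queue; the code that reads the change feed stays the same.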

ndeodhar commented 5 years ago

Thanks for the feedback and interest, @joeblew99!

That's our ultimate goal: to provide a generic framework that can be leveraged across different application stacks. Since Kafka is widely used, we started our Beta with Kafka. Our immediate next step is to provide a sample console app that app developers can use as a reference to build their own CDC sinks: https://github.com/yugabyte/yugabyte-db/issues/2351

You've provided some good suggestions and we'll look into those.

cawfeecoder commented 4 years ago

@ndeodhar I second this request. I'd love to have an agnostic way to plug into other message brokers like NATS.

Rkiouak commented 4 years ago

Are there any plans to expose, or allow the option of exposing, a vanilla HTTP/2-based gRPC endpoint for the CDCService, say on a different port of the yb-master servers?

I'm referring to this service definition in: https://github.com/yugabyte/yugabyte-db/blob/master/src/yb/cdc/cdc_service.proto#L44

service CDCService {
    rpc CreateCDCStream (CreateCDCStreamRequestPB) returns (CreateCDCStreamResponsePB);
    rpc DeleteCDCStream (DeleteCDCStreamRequestPB) returns (DeleteCDCStreamResponsePB);
    rpc ListTablets (ListTabletsRequestPB) returns (ListTabletsResponsePB);
    rpc GetChanges (GetChangesRequestPB) returns (GetChangesResponsePB);
    rpc GetCheckpoint (GetCheckpointRequestPB) returns (GetCheckpointResponsePB);
    rpc UpdateCdcReplicatedIndex (UpdateCdcReplicatedIndexRequestPB)
        returns (UpdateCdcReplicatedIndexResponsePB);
}

It would be awesome to be able to invoke a gRPC call such as: rpc StreamChanges (CreateCDCStreamRequestPB) returns (stream GetChangesResponsePB)
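Until something like a server-streaming call exists, a client could approximate it by polling the unary call. Below is a rough sketch of that adapter, not the actual CDCService client: the `get_changes` callable, its `(stream_id, from_index) -> (records, next_index)` shape, and the fake in-memory log are all assumptions made so the example runs on its own. The real GetChangesRequestPB and GetChangesResponsePB fields are not reproduced here.

```python
import time
from typing import Callable, Iterator

# Hypothetical shape of a unary call: (stream_id, from_index) -> (records, next_index).
GetChanges = Callable[[str, int], tuple[list[str], int]]


def stream_changes(
    get_changes: GetChanges,
    stream_id: str,
    start_index: int = 0,
    poll_interval: float = 0.0,
    max_idle_polls: int = 3,
) -> Iterator[str]:
    """Turn repeated unary polls into one continuous stream of records.

    max_idle_polls exists only so this sketch terminates; a long-running
    client would keep polling instead of stopping.
    """
    index = start_index
    idle = 0
    while idle < max_idle_polls:
        records, next_index = get_changes(stream_id, index)
        if not records:
            idle += 1
            time.sleep(poll_interval)
            continue
        idle = 0
        index = next_index
        yield from records


# Fake server side: an in-memory log served at most two records per call.
LOG = ["r0", "r1", "r2", "r3", "r4"]


def fake_get_changes(stream_id: str, from_index: int) -> tuple[list[str], int]:
    batch = LOG[from_index:from_index + 2]
    return batch, from_index + len(batch)


streamed = list(stream_changes(fake_get_changes, "stream-1"))
```

A true server-streaming RPC would remove the polling loop on the client, but the consumer-facing shape (an iterator of change records) would look much the same.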

nvcnvn commented 4 years ago

Are there any examples of implementing your own connectors? What is the difference, if any, between Kafka-based CDC and custom connectors?

I assume that for Kafka-based CDC, as long as the cluster is functional, I will get an at-least-once delivery guarantee?

What about custom connectors? If a connector crashes (I assume a connector in this context is an external client subscribed to something like a trigger or a gRPC stream), how does it know where to begin again?
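One common answer to the restart question is checkpointing: the connector records how far it has delivered and resumes from that point. The sketch below illustrates the general pattern only and is not how YugabyteDB or its Kafka connector is implemented. `CheckpointStore`, `run_connector`, and the `crash_before_commit_at` switch are hypothetical names, and the change log is a plain list. Because the checkpoint is saved after delivery, a crash in between causes a redelivery on restart, which is what at-least-once means in practice.

```python
from typing import Callable, Optional


class CheckpointStore:
    """Hypothetical durable store for the last committed position (in-memory here)."""

    def __init__(self) -> None:
        self._index = 0

    def load(self) -> int:
        return self._index

    def save(self, index: int) -> None:
        self._index = index


def run_connector(
    log: list[str],
    store: CheckpointStore,
    deliver: Callable[[str], None],
    crash_before_commit_at: Optional[int] = None,
) -> None:
    """Deliver records starting at the saved checkpoint, committing after each one."""
    index = store.load()
    while index < len(log):
        deliver(log[index])
        if crash_before_commit_at == index:
            # Simulates dying after delivery but before the checkpoint is saved.
            raise RuntimeError("simulated crash before checkpoint was saved")
        index += 1
        store.save(index)


log = ["a", "b", "c"]
store = CheckpointStore()
delivered: list[str] = []

try:
    run_connector(log, store, delivered.append, crash_before_commit_at=1)
except RuntimeError:
    pass  # the connector "crashed"; the checkpoint still points at record 1

# Restart: the connector resumes from the checkpoint and redelivers "b".
run_connector(log, store, delivered.append)
```

After the restart, `delivered` contains "b" twice, so the downstream consumer has to tolerate or deduplicate repeats. The GetCheckpoint call in the service definition quoted earlier suggests positions can also be tracked on the server side, but that is not modeled here.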

Bessonov commented 4 years ago

I've never used it, but it may be worth taking a look: https://debezium.io/

cawfeecoder commented 3 years ago

What's the current status on this?

nvcnvn commented 3 years ago

> Never used it, but may be worth to take a look: https://debezium.io/

I just looked at it. It also seems to rely on a plugin like pgoutput or wal2json and then push to Kafka; I don't think those plugins will work with Yugabyte.

ma-hartma commented 2 years ago

Seems like the preconditions for this are being worked on!

NATS Streaming has been deprecated in the meantime and replaced with NATS JetStream, which would be awesome to have as a connector/sink: https://docs.nats.io/nats-concepts/jetstream

gedw99 commented 1 year ago

https://natsbyexample.com/examples/integrations/debezium/cli