moiot / gravity

A Data Replication Center
Apache License 2.0

Outbox pattern #302

Open mehran-prs opened 3 years ago

mehran-prs commented 3 years ago

I need to implement an outbox pattern using gravity. I need to capture the create operation on the outbox collection of our MongoDB database and send it as output to Kafka.

My outbox collection has these fields: kafka_topic, kafka_key, kafka_value, kafka_header_keys, kafka_headers

As you can see, I need to set the Kafka key, value, topic, ... using these fields in each created outbox document. I'm thinking about implementing a new gravity output which does that for me. What do you think?
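To make the mapping concrete, here is a minimal sketch of how one outbox document could be turned into a Kafka message. It is not gravity code: the `OutboxDoc`, `Header`, and `Message` types and the `ToMessage` function are hypothetical, the Go field types are guesses, and it assumes `kafka_header_keys` and `kafka_headers` are parallel lists of the same length.

```go
package main

import "errors"

// OutboxDoc mirrors the outbox fields listed above.
// The Go types are assumptions; the thread only gives the field names.
type OutboxDoc struct {
	KafkaTopic      string   `bson:"kafka_topic"`
	KafkaKey        string   `bson:"kafka_key"`
	KafkaValue      []byte   `bson:"kafka_value"`
	KafkaHeaderKeys []string `bson:"kafka_header_keys"`
	KafkaHeaders    []string `bson:"kafka_headers"`
}

// Header and Message are hypothetical, library-neutral types.
type Header struct {
	Key   string
	Value []byte
}

type Message struct {
	Topic   string
	Key     []byte
	Value   []byte
	Headers []Header
}

// ToMessage maps one outbox document to one Kafka message.
// Assumption: header keys and header values are parallel lists.
func ToMessage(doc OutboxDoc) (Message, error) {
	if doc.KafkaTopic == "" {
		return Message{}, errors.New("outbox document has no kafka_topic")
	}
	if len(doc.KafkaHeaderKeys) != len(doc.KafkaHeaders) {
		return Message{}, errors.New("kafka_header_keys and kafka_headers differ in length")
	}
	headers := make([]Header, 0, len(doc.KafkaHeaderKeys))
	for i, k := range doc.KafkaHeaderKeys {
		headers = append(headers, Header{Key: k, Value: []byte(doc.KafkaHeaders[i])})
	}
	return Message{
		Topic:   doc.KafkaTopic,
		Key:     []byte(doc.KafkaKey),
		Value:   doc.KafkaValue,
		Headers: headers,
	}, nil
}
```

A dedicated output would call something like `ToMessage` for every insert event captured on the outbox collection and hand the result to the Kafka producer.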

Are you ok with a new PR to add kafka_outbox as a new output plugin?

Ryan-Git commented 3 years ago

I think with change data capture, there's no need for an outbox collection, but anyway...

A new output would work, but it seems too specialized. The listed fields determine two different aspects:

I think a new router and an enhanced encoder would work better if there are different outbox collections. For example, a router which takes one field of the current message as the target topic and another field as the partition key. This is general enough. For encoding, ignoring some fields can already be handled with a Filter. As for the headers, the current encoders could be enhanced.
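As a rough illustration of the router idea, the sketch below picks the topic and partition key from configurable fields of a document. `FieldRouter` and its `Route` method are hypothetical names, not part of gravity's router interface, and the document is modeled as a plain map for simplicity.

```go
package main

import "fmt"

// FieldRouter is a hypothetical router config: it names the document
// fields that hold the target topic and the partition key.
type FieldRouter struct {
	TopicField string
	KeyField   string
}

// Route returns the topic and partition key for one document.
// It fails if either field is missing or is not a string.
func (r FieldRouter) Route(doc map[string]interface{}) (string, string, error) {
	topic, ok := doc[r.TopicField].(string)
	if !ok || topic == "" {
		return "", "", fmt.Errorf("field %q is missing or not a non-empty string", r.TopicField)
	}
	key, ok := doc[r.KeyField].(string)
	if !ok {
		return "", "", fmt.Errorf("field %q is missing or not a string", r.KeyField)
	}
	return topic, key, nil
}
```

Because the field names come from configuration, the same router would serve outbox collections with different layouts, which is the generality argued for above.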

> Are you ok with a new PR to add kafka_outbox as a new output plugin?

If that plugin works only for these fields or the like, I'm afraid not.

mehran-prs commented 3 years ago

CDC is good, but we need the outbox to implement a transactional outbox pattern (which carries the app's events).

> If that plugin works only for these fields or the like, I'm afraid not.

No, I'll read the outbox config, so users can customize their own outbox collection.

OK, so I'll look into it and add more details about the implementation here. What do you think?

Ryan-Git commented 3 years ago

I wonder if there's any standard for the structure of an outbox collection?

> OK, so I'll look into it and add more details about the implementation here. What do you think?

I've got a new idea about the encoder. The Kafka headers should be configured inside the kafka output plugin rather than in the encoder. So a new simple router plus header support within the existing kafka output seems more promising. What do you think?

mehran-prs commented 3 years ago

> I wonder if there's any standard for the structure of an outbox collection?

There is no standard, but all implementations are similar: a collection (or a table in other DBs) which contains all the fields we need to send to Kafka (topic, key, value, headers), example of its implementation

> So a new simple router and header support within existing Kafka output seems more promising

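Yes, what I got is:

  1. Need a new encoder which just returns a field's value as the event's payload (e.g., doc.value).
  2. Need a new router that specifies topic based on the document's field (e.g., doc.topic)
  3. Need a new output to add headers.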

I'm also wondering which would be better for an outbox pattern: extending gravity or extending the mongo-kafka connector.

Ryan-Git commented 3 years ago

> Need a new encoder which just returns a field's value as the event's payload (e.g., doc.value).

Yes, if the consumer needs it (you don't want another pipeline to reshape the data...)
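A field-value encoder of this kind could look roughly like the following. This is only a sketch: `FieldEncoder` is a hypothetical name, it does not implement gravity's actual encoder interface, and it assumes the payload field holds either bytes or a string.

```go
package main

import "fmt"

// FieldEncoder is a hypothetical encoder that emits the value of a single
// document field (e.g. doc.value) as the whole message payload.
type FieldEncoder struct {
	ValueField string
}

// Encode returns the configured field as raw bytes.
// Assumption: the field is stored as []byte or string.
func (e FieldEncoder) Encode(doc map[string]interface{}) ([]byte, error) {
	switch v := doc[e.ValueField].(type) {
	case []byte:
		return v, nil
	case string:
		return []byte(v), nil
	default:
		return nil, fmt.Errorf("field %q is missing or is neither bytes nor string", e.ValueField)
	}
}
```

With this, consumers receive exactly what the application wrote into the outbox, without the surrounding change-event envelope.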

> Need a new router that specifies topic based on the document's field (e.g., doc.topic)

Yes.

> Need a new output to add headers.

No. I think modifying the current kafka output is OK. Just add the ability to set headers from configured fields.
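One way to picture "headers from configured fields" is sketched below. The `Header` type and `HeadersFromFields` function are hypothetical and not taken from gravity's kafka output. The sketch assumes the config is a list of document field names, that each field name doubles as the header key, and that values are stringified with `fmt.Sprint`.

```go
package main

import "fmt"

// Header is a hypothetical, library-neutral Kafka header.
type Header struct {
	Key   string
	Value []byte
}

// HeadersFromFields copies the configured document fields into headers.
// Assumptions: the field name is used as the header key, values are
// stringified with fmt.Sprint, and absent fields are skipped.
func HeadersFromFields(doc map[string]interface{}, fields []string) []Header {
	headers := make([]Header, 0, len(fields))
	for _, f := range fields {
		v, ok := doc[f]
		if !ok {
			continue // field not present in this document
		}
		headers = append(headers, Header{Key: f, Value: []byte(fmt.Sprint(v))})
	}
	return headers
}
```

Keeping this inside the existing output means the router and encoder stay unaware of headers altogether.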

So now we have several possible solutions:

  1. Combine all outbox related config into a NEW (kafka) output plugin
  2. Combine all outbox related config into the EXISTING kafka output plugin
  3. Separate routing and encoding into different components

Option 1 is the initial proposal. Given the example provided, I'm also OK with it, but I still prefer option 3 since it's more general and not much more work. @ming535 any comments?