hpgrahsl / kafka-connect-mongodb

**Unofficial / Community** Kafka Connect MongoDB Sink Connector -> integrated 2019 into the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector
Apache License 2.0

Write model strategy - optional upsert #142

Closed PushUpek closed 3 months ago

PushUpek commented 5 months ago

I don't see any option to disable upsert in the write model strategy. If I read the source code correctly, this behaviour is hardcoded. Are there any plans to make upsert optional?

hpgrahsl commented 5 months ago

THX for your question @PushUpek! As also mentioned in the documentation, this is the standard behaviour regarding write models:

The default behaviour of the connector, whenever documents are written to MongoDB collections, is to use a ReplaceOneModel in upsert mode and to create the filter document based on the _id field that results from applying the configured DocumentIdAdder to the value structure of the sink document.
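In driver terms, that default boils down to something like the following (a minimal sketch using the MongoDB Java driver's bulk-write model classes; the class and method names around it are illustrative, not the connector's actual internals):

```java
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;
import org.bson.BsonDocument;

// Sketch of the default write model: replace-with-upsert keyed on _id.
// The filter is built from the _id that the configured DocumentIdAdder
// placed into the sink document's value structure.
public class DefaultReplaceSketch {

    public static WriteModel<BsonDocument> createWriteModel(BsonDocument valueDoc) {
        BsonDocument filter = new BsonDocument("_id", valueDoc.get("_id"));
        // upsert(true) is the part that is hardcoded today: a missing
        // document is inserted instead of the record being skipped.
        return new ReplaceOneModel<>(filter, valueDoc,
                new ReplaceOptions().upsert(true));
    }
}
```

Note that older 3.x driver versions expressed the upsert flag via UpdateOptions rather than ReplaceOptions; the semantics are the same.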

May I ask what you would like to achieve and how you'd want this behaviour to differ? There is in fact already some flexibility when you look at the options offered by the PostProcessor chain settings, the DocumentIdAdder, as well as more customized write models, all of which are briefly described in the README.
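For reference, those knobs are plain sink connector properties. A hedged example (key names and strategy class names as I recall them from the README; please verify against the version you actually run):

```properties
# post processor chain: fully-qualified class names, applied in order
mongodb.post.processor.chain=at.grahsl.kafka.connect.mongodb.processor.DocumentIdAdder

# how the _id of the sink document is generated
mongodb.document.id.strategy=at.grahsl.kafka.connect.mongodb.processor.id.strategy.BsonOidStrategy

# which write model strategy is used when writing to MongoDB
mongodb.writemodel.strategy=at.grahsl.kafka.connect.mongodb.writemodel.strategy.ReplaceOneDefaultStrategy
```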

PushUpek commented 5 months ago

I just want to update existing documents in the collection without inserting new ones. I don't see an option to do this with post-processing.

hpgrahsl commented 5 months ago

So what you'd like is that when a Kafka record is read from a topic, only an update is applied, e.g. based on the _id field? If so, I'd say this is a rather special use case and has never been requested since I launched this project many moons ago :)

There are primarily two reasons why this hasn't been implemented yet:

1) Imagine you start with an empty MongoDB collection. Update-only writes will never match anything, hence the collection will never receive any data.

2) Even with existing documents in the collection, any Kafka record read from the topic for which there is no matching MongoDB document would be dropped / skipped, which again is typically not something people wanted in the past.

The bottom line is: if you really have this requirement, you can always fork the project and add your own custom write model strategy. This is the interface to implement: https://github.com/hpgrahsl/kafka-connect-mongodb/blob/master/src/main/java/at/grahsl/kafka/connect/mongodb/writemodel/strategy/WriteModelStrategy.java
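An update-only strategy could look roughly like this. This is a sketch, assuming the linked interface exposes a single createWriteModel(SinkDocument) method and that SinkDocument#getValueDoc() returns an Optional&lt;BsonDocument&gt;; check the linked source before relying on either:

```java
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;
import org.bson.BsonDocument;

import at.grahsl.kafka.connect.mongodb.converter.SinkDocument;
import at.grahsl.kafka.connect.mongodb.writemodel.strategy.WriteModelStrategy;

// Hypothetical update-only strategy: same as the default, but with the
// upsert flag off, so records without a matching _id in the collection
// are silently skipped by MongoDB (caveat 2 above).
public class UpdateOnlyWriteModelStrategy implements WriteModelStrategy {

    @Override
    public WriteModel<BsonDocument> createWriteModel(SinkDocument document) {
        BsonDocument valueDoc = document.getValueDoc()
                .orElseThrow(() -> new IllegalArgumentException(
                        "sink document value must not be missing"));
        BsonDocument filter = new BsonDocument("_id", valueDoc.get("_id"));
        // upsert defaults to false; shown explicitly for contrast
        return new ReplaceOneModel<>(filter, valueDoc,
                new ReplaceOptions().upsert(false));
    }
}
```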

There are several examples in the project which you can look at for inspiration, in this package: https://github.com/hpgrahsl/kafka-connect-mongodb/tree/master/src/main/java/at/grahsl/kafka/connect/mongodb/writemodel/strategy

hpgrahsl commented 3 months ago

Closing this due to the PR suggested in the upstream project here: https://github.com/mongodb/mongo-kafka/pull/162