
kafka-connect-mongodb

The MongoDB sink connector for Kafka Connect provides a simple, continuous link from a Kafka topic or set of topics to a MongoDB collection or set of collections.

The connector consumes Kafka messages, optionally renames message fields, selects specific fields, and upserts the records into the MongoDB collection.

The connector supports messages in both JSON and Avro formats, with or without a schema, and multiple topic partitions.
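For reference, when connect.use_schema is set to "true" the JSON messages presumably follow the standard Kafka Connect envelope produced by the JsonConverter, with the schema inlined next to the payload, e.g.:

```json
{
  "schema": {
    "type": "struct",
    "fields": [
      {"type": "string", "optional": false, "field": "firstName"},
      {"type": "int32", "optional": true, "field": "age"}
    ]
  },
  "payload": {"firstName": "John", "age": 30}
}
```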

Connector Configurations:

| Parameter | Description | Example |
| --- | --- | --- |
| db.host | host name of the database | "localhost" |
| db.port | port number of the database | "27017" |
| db.name | name of the database | "myDataBase" |
| db.collections | name of the collection(s) to write to | "myCollection" |
| write.batch.enabled | use batch writing when a task receives more records than write.batch.size | "true"/"false" |
| write.batch.size | the batch size to use when batch writing is enabled | "200" |
| connect.use_schema | true if the data in the topic contains a schema | "true"/"false" |
| record.fields.rename | rename fields from the data in the topic | "field1=>newField1, field2=>newField2" |
| record.keys | keys in the db to upsert by | "key1,key2" |
| record.fields | specific fields from the record to insert into the db | "field1,field2" |
| record.timestamp.name | name of a timestamp field added to each record when it is written to the db | "updateDate" |
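Taken together, a minimal standalone-mode properties file for the connector might look like the sketch below. The connector class name and the topics value are assumptions for illustration; substitute the class packaged in the connector jar and your own topic:

```properties
# mongodb-sink.properties -- illustrative sketch, not a verbatim sample from this repo.
# The connector.class value below is an assumption; check the connector jar for the actual class.
name=mongodb-sink
connector.class=com.startapp.data.MongoSinkConnector
tasks.max=1
topics=topic1
db.host=localhost
db.port=27017
db.name=myDataBase
db.collections=myCollection
write.batch.enabled=true
write.batch.size=200
connect.use_schema=false
```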

Quick Start

In the following example we will produce schemaless JSON data to a Kafka topic and insert it into a test collection in our MongoDB database, with the connector running in distributed mode.

Pre-start

Start Kafka
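If no broker is running yet, the stock Kafka quickstart commands (run from the Kafka installation directory, each in its own terminal) will bring one up:

```bash
# Start ZooKeeper, then a single Kafka broker on localhost:9092
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
```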

Start Kafka Connect worker
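This example uses distributed mode, so start the worker with the stock distributed configuration; make sure the connector jar is on the worker's classpath or plugin path first:

```bash
# Start a distributed Connect worker; its REST API listens on port 8083 by default
bin/connect-distributed.sh config/connect-distributed.properties
```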

Register the MongoDB connector:
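A sketch of the registration request against the worker's REST API follows. The connector class name and topic are assumptions for illustration; use the class shipped in the jar and your own topic name:

```bash
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mongodb-sink",
    "config": {
      "connector.class": "com.startapp.data.MongoSinkConnector",
      "tasks.max": "1",
      "topics": "topic1",
      "db.host": "localhost",
      "db.port": "27017",
      "db.name": "myDataBase",
      "db.collections": "testCollection",
      "connect.use_schema": "false"
    }
  }'
```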

Check it out
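To verify the pipeline end to end, produce a schemaless JSON record to the topic and then query the collection (names follow the registration sketch above; the record fields are arbitrary example data):

```bash
# Produce one schemaless JSON record to the topic
echo '{"firstName": "John", "lastName": "Smith", "age": 30}' | \
  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topic1

# Confirm the document landed in MongoDB
mongo myDataBase --eval 'db.testCollection.find().pretty()'
```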

Using upsert, modifying field names and inserting only specific fields
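For instance, adding the following entries to the "config" object of the registration request (the field names are illustrative) would upsert on id, keep only three fields, and rename two of them on the way in:

```json
{
  "record.keys": "id",
  "record.fields": "id,firstName,lastName",
  "record.fields.rename": "firstName=>first_name, lastName=>last_name"
}
```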

Add InsertedTime field to the record in the DB
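To have the connector stamp each document with the time it was written, set record.timestamp.name in the same "config" object; the field name below matches this section's title:

```json
{
  "record.timestamp.name": "InsertedTime"
}
```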