hpgrahsl / kafka-connect-mongodb

**Unofficial / Community** Kafka Connect MongoDB Sink Connector -> integrated in 2019 into the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector
Apache License 2.0
153 stars · 61 forks

cannot see kafka topic in mongodb. question about the connection issue #42

Closed gjiang1 closed 6 years ago

gjiang1 commented 6 years ago

Hello, thank you for your article and help! I followed your instructions to set up the connection between Kafka and MongoDB (sink). After I ran ./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/sink-connect-mongodb.properties and then ran curl, I got the correct response:

{"offsets":[{"partition":0,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":21}

and I can see "avrotest" in the Kafka topic list. However, I got the error below and cannot find an "avrotest" collection in Mongo:

[2018-06-25 15:03:25,419] INFO 0 have been processed (org.radarcns.connect.mongodb.MongoDbSinkTask:57)
[2018-06-25 15:03:25,420] ERROR WorkerSinkTask{id=kafka-connector-mongodb-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:172)
java.lang.NoClassDefFoundError: com/mongodb/MongoException
    at org.radarcns.connect.mongodb.MongoDbSinkTask.createMongoDbWriter(MongoDbSinkTask.java:114)
    at org.radarcns.connect.mongodb.MongoDbSinkTask.start(MongoDbSinkTask.java:95)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:281)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.mongodb.MongoException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 11 more
[2018-06-25 15:03:25,421] ERROR WorkerSinkTask{id=kafka-connector-mongodb-sink-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:173)
[2018-06-25 15:03:25,421] INFO Stopping MongoDBSinkTask (org.radarcns.connect.mongodb.MongoDbSinkTask:163)
[2018-06-25 15:03:25,421] INFO Stopped MongoDBSinkTask (org.radarcns.connect.mongodb.MongoDbSinkTask:185)
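The decisive line in the trace above is the NoClassDefFoundError: the Connect worker's JVM cannot load a class from the MongoDB Java driver, which means the driver jar is not on the worker's classpath. A minimal sketch (not part of the original thread) of extracting the missing class name from such a log line:

```python
import re

# Line taken from the log above; NoClassDefFoundError reports the class
# in JVM-internal form with '/' separators.
log_line = "java.lang.NoClassDefFoundError: com/mongodb/MongoException"

match = re.search(r"NoClassDefFoundError: ([\w/$]+)", log_line)
missing_class = match.group(1).replace("/", ".")
print(missing_class)  # → com.mongodb.MongoException
```

The com.mongodb package lives in the MongoDB Java driver jar, not in the connector jar itself, which is why the task fails during startup before writing any records.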

I know the connection between Kafka and MongoDB failed, but I cannot figure out why. Can you give some suggestions? Do I need to set up a plugin, configuration, or workers specifically before running connect-standalone? Which parts did I miss?

Thank you very much for your time and help! My email is gjiang@psi-it.com. gw

gjiang1 commented 6 years ago

I started ZooKeeper, the Kafka server, Kafka REST, and Schema Registry one by one, and they look like they are up and running, but when I check the status, all show DOWN now. This confuses me: should I start them one by one, or run "/confluent-4.1.1/bin/confluent start" to bring up all six together?

/confluent-4.1.1/bin/confluent status

ksql-server is [DOWN]
connect is [DOWN]
kafka-rest is [DOWN]
schema-registry is [DOWN]
kafka is [DOWN]
zookeeper is [DOWN]

But actually, they are running.

Thanks for your help!

hpgrahsl commented 6 years ago

Hi @gjiang1 !

First, I would like to ask which article you are referring to and which instructions you are following :)

Apart from that, what's a bit confusing to me is the fact that the log snippet you posted contains a completely different package name, namely org.radarcns.connect.mongodb. Could it be that you are actually using a different MongoDB sink connector, e.g. this one: https://github.com/RADAR-base/MongoDb-Sink-Connector? Taking a closer look at the log messages, you can see that the following file is what you are most likely running: https://github.com/RADAR-base/MongoDb-Sink-Connector/blob/master/src/main/java/org/radarcns/connect/mongodb/MongoDbSinkTask.java

Meanwhile I'm waiting for your hopefully clarifying response. THX in advance!

gjiang1 commented 6 years ago

Hi Hpgrahsl, thank you very much for your responses.

I followed https://github.com/RADAR-base/MongoDb-Sink-Connector. I tried two ways to start Kafka. One is "/confluent-4.1.1/bin/confluent start"; as you can see below, everything is UP:

Using CONFLUENT_CURRENT: /tmp/confluent.NFBX0ijJ
Starting zookeeper
zookeeper is [UP]
Starting kafka
kafka is [UP]
Starting schema-registry
schema-registry is [UP]
Starting kafka-rest
kafka-rest is [UP]
Starting connect
connect is [UP]
Starting ksql-server
ksql-server is [UP]

Then I ran ./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/sink-connect-mongodb.properties, but I got errors; please see the attachment (kafka connector error file1.docx). I checked the Confluent status and the six services above were still UP.

I killed all Confluent sessions and started ZooKeeper, the Kafka server, Schema Registry, and Kafka REST one by one following the instructions above (for example, ./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties), keeping the terminals open. I then ran "./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/sink-connect-mongodb.properties" and

curl -X POST -H "Content-Type: application/vnd.kafka.avro.v2+json" \
  -H "Accept: application/vnd.kafka.v2+json" \
  --data '{"value_schema": "{\"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "testUser"}}]}' \
  "http://localhost:8082/topics/avrotest"

but I cannot see an avrotest collection in MongoDB.

Attached please see the connect-avro-standalone.properties and sink-connect-mongodb.properties files (mongodb, confluent, kafka installation and setup.docx; mongodb installation in linux environment.docx).

I am not sure which part I am missing or setting wrong. I will follow your suggestions and try again tomorrow. If you find something wrong in my files, please let me know.

Thank you so much for your time and help!

gw

gjiang1 commented 6 years ago

Hi Hpgrahsl,

I will follow your suggestions and run these processes again; my message from yesterday is above. I very much appreciate your help!



gjiang1 commented 6 years ago

I will show you what I did step by step last time; please help me find the problem. Thanks a lot! gw

gjiang1 commented 6 years ago

Hello Hpgrahsl,

Below is what I did; please help me figure out the problem. I am a database person, not a programmer or Java person. This is my first time hearing about Kafka and Confluent, and I am learning how to set up a connection between a source DB <> Confluent/Kafka <> a sink DB for a potential future project. I very much appreciate your help!

  1. I did not do the Docker installation using the “radarbase/kafka-connect-mongodb-sink” Docker image, since I already have a Docker container in an EC2 Linux environment with MongoDB installed there. I installed Confluent 4.1.1 and the Confluent JDBC Connector (to be used for the source Oracle database connection).

  2. I downloaded kafka-connect-mongodb-sink-0.2.2-javadoc.jar from https://github.com/RADAR-base/MongoDb-Sink-Connector/releases and put this jar file into /confluent-4.1.1/share/java/

  3. export CLASSPATH=/confluent-4.1.1/share/java/kafka-connect-mongodb-sink-0.2.2.jar

  4. Start Confluent:

  5. [root@7fa5e286b664 bin]# /confluent-4.1.1/bin/confluent start

     Result:
     Using CONFLUENT_CURRENT: /tmp/confluent.NFBX0ijJ
     Starting zookeeper
     zookeeper is [UP]
     Starting kafka
     kafka is [UP]
     Starting schema-registry
     schema-registry is [UP]
     Starting kafka-rest
     kafka-rest is [UP]
     Starting connect
     connect is [UP]
     Starting ksql-server
     ksql-server is [UP]

  6. Create the “sink-connect-mongodb.properties” file in /confluent-4.1.1/etc:

     # Kafka consumer configuration
     name=kafka-connector-mongodb-sink

     # Kafka connector configuration
     connector.class=org.radarcns.connect.mongodb.MongoDbSinkConnector
     tasks.max=1

     # Topics that will be consumed
     topics=avrotest

     # MongoDB server
     mongo.host=localhost
     mongo.port=27017

     # MongoDB configuration
     mongo.username=
     mongo.password=
     mongo.database=mydb

     # Collection name for putting data into the MongoDB database. The {$topic} token will be replaced
     # by the Kafka topic name.
     mongo.collection.format={$topic}

     # Factory class to do the actual record conversion
     record.converter.class=org.radarcns.connect.mongodb.serialization.RecordConverterFactory

  7. Modify the connect-avro-standalone.properties file in /confluent-4.1.1/etc/schema-registry/ to set plugin.path:

     bootstrap.servers=localhost:9092
     key.converter.schema.registry.url=http://localhost:8081
     value.converter=io.confluent.connect.avro.AvroConverter
     value.converter.schema.registry.url=http://localhost:8081
     internal.key.converter=org.apache.kafka.connect.json.JsonConverter
     internal.value.converter=org.apache.kafka.connect.json.JsonConverter
     internal.key.converter.schemas.enable=false
     internal.value.converter.schemas.enable=false
     offset.storage.file.filename=/tmp/connect.offsets

     # plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
     # Replace the relative path below with an absolute path if you are planning to start Kafka Connect from within a
     # directory other than the home directory of Confluent Platform.
     plugin.path=/confluent-4.1.1/share/java/

  8. Run the connector:

     [root@7fa5e286b664 confluent-4.1.1]# ./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/sink-connect-mongodb.properties

But the process stopped because of the errors in the log file that I sent to you.
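One likely contributor to the NoClassDefFoundError in that log (an observation, not confirmed in the thread): step 2 downloads the -javadoc jar, which packages only HTML documentation, while the CLASSPATH in step 3 points at the plain kafka-connect-mongodb-sink-0.2.2.jar. Since jar files are ordinary zip archives, it is easy to check whether a jar actually contains compiled classes; the sketch below builds a tiny stand-in archive (contents illustrative) to demonstrate the check:

```python
import io
import zipfile

# Build a stand-in for a "-javadoc" jar: it holds HTML pages only,
# no compiled .class files.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("index.html", "<html></html>")
    jar.writestr("org/radarcns/connect/mongodb/MongoDbSinkTask.html", "<html></html>")

# The same check, applied to a real jar on disk, tells you whether it
# can satisfy the classpath at all.
with zipfile.ZipFile(buf) as jar:
    class_files = [n for n in jar.namelist() if n.endswith(".class")]

print(len(class_files))  # → 0: no classes, so nothing for the JVM to load
```

On a real system the equivalent check is `unzip -l <jar> | grep '\.class$'` against the jar that is actually on the classpath.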

  1. If I manually start zookeeper, kafka-server, kafka-rest, and schema-registry separately from different terminals, I can see they are running, but with errors.

  2. [root@7fa5e286b664 confluent-4.1.1]# curl -X POST -H "Content-Type: application/vnd.kafka.avro.v2+json" \
       -H "Accept: application/vnd.kafka.v2+json" \
       --data '{"value_schema": "{\"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "testUser"}}]}' \
       "http://localhost:8082/topics/avrotest"

     I can see the following (correct) response:

     {"offsets":[{"partition":0,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":21}

  3. Check the Kafka topics:

     [root@7fa5e286b664 bin]# ./kafka-topics --list --zookeeper localhost:2181
     __confluent.support.metrics
     __consumer_offsets
     _schemas
     test-oracle-jdbc-USERS
     avrotest

  4. Open Mongo and check the database, but I cannot see an “avrotest” collection:

     show collections
     customers
     month
     month1
     people
     students
     users

  5. When I tried to set up the source JDBC connector, I got some errors too.
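The curl call in step 2 shows the payload shape the Confluent REST Proxy expects: value_schema carries the Avro schema serialized as a string, so the schema JSON appears backslash-escaped inside the request body. A sketch of building that same payload programmatically (request construction only; no proxy is contacted):

```python
import json

# The Avro schema from the curl example above.
schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}

# value_schema must be a JSON *string*, hence the inner json.dumps();
# the records list carries the actual values to produce.
body = {
    "value_schema": json.dumps(schema),
    "records": [{"value": {"name": "testUser"}}],
}
payload = json.dumps(body)

# The doubled encoding is what produces the backslash-escaped quotes
# seen in the curl command.
print('\\"type\\"' in payload)  # → True
```

A POST of this payload with Content-Type application/vnd.kafka.avro.v2+json to /topics/avrotest is exactly what the curl command does.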

You mentioned the MongoDbSinkTask.java file; how and where can I use it? Did I set plugin.path correctly? Is there anything I am missing? I did not use the Docker image to set up Kafka and MongoDB; do you think that is an issue?

Thanks a lot for your time and help!


gjiang1 commented 6 years ago

I modified plugin.path in "connect-avro-standalone.properties":

plugin.path=/etc/alternatives, /var/lib/alternatives, /usr/share/java, /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.x86_64/jre/bin, /usr/bin, /etc/pki/java, /etc/java, /etc/pki/ca-trust/extracted/java, /usr/lib

then started zookeeper, kafka-server, kafka-rest, and schema-registry one by one; they are still running. I ran the MongoDB-Sink-Connector again:

./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/sink-connect-mongodb.properties

It is still running, but I can see the error:

[2018-06-27 16:15:08,196] ERROR WorkerSinkTask{id=sink-connect-mongodb-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:173)
[2018-06-27 16:15:08,203] INFO Stopping MongoDBSinkTask (org.radarcns.connect.mongodb.MongoDbSinkTask:163)
[2018-06-27 16:15:08,203] INFO Stopped MongoDBSinkTask (org.radarcns.connect.mongodb.MongoDbSinkTask:185)


I am not sure what's wrong.
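For reference, Kafka Connect expects each plugin.path entry to be a directory whose immediate subdirectories each contain one connector together with all of its dependency jars (here that would include the MongoDB Java driver); pointing plugin.path at JRE directories such as /usr/lib/jvm/.../jre/bin does not help. A sketch of the expected layout (directory names and driver version are illustrative, not from the thread):

```python
import tempfile
from pathlib import Path

# One plugin.path entry: a directory of plugins...
plugin_root = Path(tempfile.mkdtemp()) / "connect-plugins"

# ...where each plugin is a subdirectory holding the connector jar
# plus every jar it depends on.
sink_dir = plugin_root / "mongodb-sink"
sink_dir.mkdir(parents=True)
(sink_dir / "kafka-connect-mongodb-sink-0.2.2.jar").touch()
(sink_dir / "mongodb-driver-3.6.4.jar").touch()  # driver version illustrative

print(sorted(p.name for p in sink_dir.iterdir()))
# → ['kafka-connect-mongodb-sink-0.2.2.jar', 'mongodb-driver-3.6.4.jar']
```

The worker config would then contain a single absolute entry such as plugin.path=/path/to/connect-plugins.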

Thanks for your help!

hpgrahsl commented 6 years ago

Hi again @gjiang1!

Thx for all your detailed comments. However, as you pointed out several times, you are not using my sink connector project but a different one from https://github.com/RADAR-base, namely https://github.com/RADAR-base/MongoDb-Sink-Connector

So I would highly recommend that you post your "issue" (which is really a question about how to set up and run their project) on their issue tracker instead of mine.

I will thus close this issue since it is not related to my project. Wishing you all the best getting the other project up and running. If you can't, you are welcome to give my project a try :)

Greetings!

gjiang1 commented 6 years ago

Ok, thanks.