Closed — ferozed closed this issue 4 years ago
The connect image just runs connect-distributed, not Schema Registry and/or Control Center.
That column you're reading is "Packages Included", meaning that the Control Center interceptors are installed for monitoring and schema-registry is included for the Avro converters.
We need an image that runs connect-standalone to store offset data locally for a JDBC sink connector. I'm seeing some unofficial images that don't have much documentation. Can there be an official one included in the confluentinc hub?
@nickvgils
Data isn't stored locally anyway, though.
The JDBC sink connector always reads from a remote Kafka topic and sends the data to a remote database. If by "locally" you mean it's running on the host machine, then you need to adjust the configurations to use the host address rather than localhost. And just because it's "distributed mode" doesn't mean it needs to be distributed over multiple instances - even standalone mode could share the same consumer group.
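To illustrate the point about remote reads and writes, here is a minimal sketch of a JDBC sink connector config; the connector name, topic, and connection URL below are placeholders, not values from this thread:

```properties
# Hypothetical JDBC sink connector config; all names and URLs are placeholders.
name=my-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
# Remote Kafka topic to consume from
topics=orders
# Remote database to write to
connection.url=jdbc:postgresql://db-host:5432/mydb
connection.user=user
connection.password=password
auto.create=true
```

Both endpoints are remote here; nothing about the connector itself persists data on the worker's disk.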
The Confluent Hub isn't for Docker images, just connector plugins
@cricket007
Hmm, thanks for your response. Let me tell you my use case:
Problem: storing from Kafka topic to a database (sqlite) on windows requires confluent connect, which is Unix only. Solution is probably using Docker. If i run the cp-kafka-connect image, it is always starting in distributed mode. This mode demands to have storage like offset, config and status to be stored inside a topic on the remote Ubuntu server. I want this data to be locally to limit the number of topics created (which is possible with running Kafka connect standalone).
> The Confluent Hub isn't for Docker images, just connector plugins
You're right, I misstated that one. What I meant was that Confluent should add a Kafka Connect standalone image to their Docker images, to store this data locally.
@nickvgils
For clarification, it's "Kafka Connect", and not proprietary to Confluent.
For sink connectors, offsets are stored back in the broker, even when using standalone mode. This is because it's a regular consumer group under the hood.
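Because a sink connector is a regular consumer group under the hood (named `connect-<connector-name>`), its offsets can be inspected with the standard tooling; a sketch, assuming a connector named `my-jdbc-sink` and a broker at `localhost:9092` (both placeholders):

```shell
# Hypothetical invocation; broker address and connector name are placeholders.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group connect-my-jdbc-sink
```

This shows the committed offsets living on the broker, even when the worker runs in standalone mode.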
I'm not sure I see a need to store config or status locally to a container, so that would just leave the sqlite database, which I hope you're volume mapping out of the container so that you can access it otherwise.
That all being said, no, Kafka Connect is not Unix-specific - both standalone and distributed modes have Windows scripts: https://github.com/apache/kafka/blob/trunk/bin/windows/connect-standalone.bat
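For example, standalone mode can be launched directly on Windows; a sketch, where the connector properties file name is a placeholder:

```bat
:: Hypothetical invocation from the Kafka install directory;
:: my-connector.properties is a placeholder file name.
bin\windows\connect-standalone.bat config\connect-standalone.properties my-connector.properties
```

The first argument is the worker config, and each following argument is a connector config.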
@cricket007
Thank you for this comment, it totally cleared my head. I misunderstood the definition of "storing locally". After doing some more reading I finally got it. You're right about storing the offsets. I was thinking about running in Docker, but with the Kafka Connect .bat file for Windows I don't need Docker at all, just as you said. I just ran Kafka Connect on Windows with the JDBC plugin, sqlite-jdbc, and Avro converter JAR files, and it works like a charm. All data from a topic gets stored inside sqlite.
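For anyone else landing here, a sketch of a standalone worker config along the lines described; all hosts, paths, and file names are illustrative, not from this thread:

```properties
# Hypothetical standalone worker config: offsets go to a local file
# instead of a Kafka topic, and plugins load from plugin.path.
bootstrap.servers=remote-broker:9092
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://schema-registry-host:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry-host:8081
offset.storage.file.filename=C:/kafka/connect.offsets
plugin.path=C:/kafka/plugins
```

The sink connector config would then point at the local database with something like `connection.url=jdbc:sqlite:C:/data/test.db` (path is a placeholder).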
Does the connector image deploy a JVM for Kafka itself as well? When I check the running processes there is one named Kafka.
cp-kafka-connect only runs the ConnectDistributed JVM process.
👋 @ferozed
Have your concerns been addressed here?
I had the same question as @ferozed. Thank you @OneCricketeer, that's crystal clear. 👏
You can find an image on my profile, btw
@OneCricketeer Yeah my concerns are addressed.
This is indeed a standalone image. Whether it runs in standalone mode or distributed mode depends on how you configure it.
@OneCricketeer
Please let me know where can I ask questions related to this thread, in case this is not the appropriate place.
I'm using Kafka where I cannot create any additional topics.
3) This is open source Kafka Connect. As answered, it runs in distributed mode, which 1-2) requires a broker to interact with, and requires 3 internal topics as well as the extra topics to sink/source.
The cp-kafka-connect image ( https://hub.docker.com/r/confluentinc/cp-kafka-connect ) comes with schema registry and control center, as per the documentation on ( https://docs.confluent.io/current/installation/docker/image-reference.html ).
However, we already have Schema Registry and Control Center running on a different machine, so we don't need the cp-kafka-connect image to start these up again. Please tell me how to run a standalone kafka-connect service. Or... please publish a standalone kafka-connect docker image.
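For completeness, the existing image runs distributed mode only and is configured through `CONNECT_*` environment variables; a minimal sketch of starting it against an external cluster, where every host, ID, and topic name is a placeholder:

```shell
# Hypothetical docker run for the distributed-only image;
# all hosts, group IDs, and topic names are placeholders.
docker run -d \
  -e CONNECT_BOOTSTRAP_SERVERS=broker-host:9092 \
  -e CONNECT_GROUP_ID=my-connect-cluster \
  -e CONNECT_CONFIG_STORAGE_TOPIC=connect-configs \
  -e CONNECT_OFFSET_STORAGE_TOPIC=connect-offsets \
  -e CONNECT_STATUS_STORAGE_TOPIC=connect-status \
  -e CONNECT_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter \
  -e CONNECT_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter \
  -e CONNECT_REST_ADVERTISED_HOST_NAME=connect \
  confluentinc/cp-kafka-connect
```

Note that this still requires the three internal topics to exist (or be creatable) on the broker, which is exactly the constraint raised above.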