confluentinc / cp-docker-images

[DEPRECATED] Docker images for Confluent Platform.
Apache License 2.0

Auto-load connectors from directory #467

Open OneCricketeer opened 6 years ago

OneCricketeer commented 6 years ago

Related to confluentinc/cp-docker-images#460

We should add a directory to kafka-connect-base that loads .json or .properties files on start.

See the MySQL container for inspiration.
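For illustration, the idea mirrors how the MySQL image auto-runs everything mounted into /docker-entrypoint-initdb.d: connector definitions would be mounted into a well-known directory and picked up when the container starts. A hypothetical compose fragment (the /etc/kafka-connect/connectors.d path does not exist in the current images and is purely illustrative):

  kafka-connect:
    image: confluentinc/cp-kafka-connect-base:latest
    volumes:
      # Each *.json file in this folder would be submitted to the REST API on start
      - $PWD/connectors:/etc/kafka-connect/connectors.d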

OneCricketeer commented 5 years ago

One alternative, as shown by @rmoff

In the Compose file, override the container command:

  volumes:
    - $PWD/scripts:/scripts  # TODO: Create this folder ahead of time, on your host
  command: 
    - bash 
    - -c 
    - |
      /etc/confluent/docker/run & 
      echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
      while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do 
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) " (waiting for 200)"
        sleep 5 
      done
      nc -vz kafka-connect 8083
      echo -e "\n--\n+> Creating Kafka Connector(s)"
      /scripts/create-connectors.sh  # Note: This script is stored externally from container
      sleep infinity
nwinkler commented 4 years ago

Thanks - this was helpful! I prefer a slightly optimized version:

  volumes:
    - $PWD/scripts:/scripts  # TODO: Create this folder ahead of time, on your host
  command: 
    - bash 
    - -c 
    - |
      /etc/confluent/docker/run & 
      echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
      while : ; do
        curl_status=$$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors)
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$curl_status " (waiting for 200)"
        if [ $$curl_status -eq 200 ] ; then
          break
        fi
        sleep 5 
      done
      echo -e "\n--\n+> Creating Kafka Connector(s)"
      /scripts/create-connectors.sh  # Note: This script is stored externally from container
      sleep infinity

Changes over the above version:

- The HTTP status code is stored in a variable, so curl is only called once per loop iteration.
- The loop waits explicitly for a 200 response instead of looping while the status is 000.
- The separate nc -vz check is dropped.

FWIW, I run this from a separate service in my Docker Compose file - the image I use is appropriate/curl:latest. That way, you don't have to start the run command in the background...
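For anyone who wants that layout, a minimal sketch of such a sidecar service (the service name, scripts mount, and entrypoint override are illustrative; the entrypoint is overridden so a shell runs instead of the image's default curl invocation):

  kafka-connect-init:
    image: appropriate/curl:latest
    depends_on:
      - kafka-connect
    volumes:
      - $PWD/scripts:/scripts
    entrypoint:
      - sh
      - -c
      - |
        echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
        while [ "$$(curl -s -o /dev/null -w '%{http_code}' http://kafka-connect:8083/connectors)" -ne 200 ] ; do
          sleep 5
        done
        /scripts/create-connectors.sh  # Note: stored on the host, mounted via /scripts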

rmoff commented 4 years ago

@nwinkler nice tips, thanks for sharing!

Matesanz commented 2 years ago

💡 If you want to run this command from an sh file, do:

echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
while true
do
    curl_status="$(curl -s -o /dev/null -w '%{http_code}' 'http://kafka-connect:8083/connectors')"
    if [ $curl_status -eq 200 ]
    then
        break
    fi
    echo -e "$(date)" " Kafka Connect listener HTTP state: " $curl_status " (waiting for 200)"
    sleep 5 
done
/scripts/create-connectors.sh  # Note: This script is stored externally from container
sleep infinity
OneCricketeer commented 2 years ago

Thanks, @Matesanz

That looks like the same thing I already posted: https://github.com/confluentinc/cp-docker-images/issues/467#issuecomment-461104319

Matesanz commented 2 years ago

> Thanks, @Matesanz
>
> That looks like the same thing I already posted: #467 (comment)

Yes, I know, but it's not the same: trying to run that code from an sh file results in an error, because the doubled $$ is Docker Compose's escaping for a literal $ and doesn't work in a plain shell script:

line 21: syntax error near unexpected token `('
line 21: `    curl_status =$$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors)
OneCricketeer commented 2 years ago

You still need to modify the container command execution to run different files, though.

This issue was opened so that this wouldn't be needed.

Matesanz commented 2 years ago

It's still useful for those looking for how to auto-load a connector.

Also, official tutorials on this topic use an external file (https://github.com/mitch-seymour/mastering-kafka-streams-and-ksqldb/blob/master/chapter-09/files/ksqldb-server/run.sh), and people following them could end up here (as I did).

Matesanz commented 2 years ago

I also found your response on this topic here really useful:

> Another solution would be to start connect-distributed once, anywhere, configure the internal topics, post a connector (which saves to the config topic), then start N containers, and they all pick up the same config.

It's not easy to figure out how to properly configure a connector using the connect-standalone and connect-distributed scripts.
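For reference, the connector definition posted to the distributed worker's REST API is just a JSON document. A minimal sketch using the FileStream source connector that ships with Kafka (the connector name, file path, and topic are illustrative, and the worker is assumed to be reachable at kafka-connect:8083):

# POST /connectors creates a new connector from a JSON body of name + config
curl -s -X POST -H 'Content-Type: application/json' http://kafka-connect:8083/connectors -d '{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "file-source-topic"
  }
}'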

OneCricketeer commented 2 years ago

It's not possible to use standalone because the connectors wouldn't be persistent. The Connect container doesn't need to change; it already runs the connect-distributed script.

My suggestion was to use at least two different containers: one that starts the Connect server alone with all needed plugins, and another that iterates over a list of mounted JSON files and posts them all, using the file name as the connector name, for example.

If that init container happens to restart, connectors with those names already exist, so no harm done.
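A rough sketch of what that init container could run, under these assumptions: connector configs are mounted under /connectors, each <name>.json file holds just the connector "config" object, and PUT /connectors/<name>/config is used because it creates or updates the connector, so re-running after a restart is harmless (all names and paths are illustrative):

#!/bin/sh
# Hypothetical init-container script: submit every mounted connector config.
# The file name becomes the connector name, e.g. /connectors/jdbc-sink.json -> jdbc-sink
CONNECT_URL=${CONNECT_URL:-http://kafka-connect:8083}

echo "Waiting for Kafka Connect to start listening on $CONNECT_URL ⏳"
until [ "$(curl -s -o /dev/null -w '%{http_code}' "$CONNECT_URL/connectors")" -eq 200 ] ; do
  sleep 5
done

for f in /connectors/*.json ; do
  [ -e "$f" ] || continue               # directory may be empty
  name=$(basename "$f" .json)
  echo "Creating/updating connector $name"
  curl -s -X PUT -H 'Content-Type: application/json' \
       --data "@$f" "$CONNECT_URL/connectors/$name/config"
done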

OneCricketeer commented 2 years ago

> official tutorials on this topic use an external file

I don't see where that script is used in the compose file

Matesanz commented 2 years ago

> official tutorials on this topic use an external file
>
> I don't see where that script is used in the compose file

here: https://github.com/mitch-seymour/mastering-kafka-streams-and-ksqldb/blob/924bc71b394baf3284c21dedc498b8f5e98898b9/chapter-09/docker-compose.yml#L37

OneCricketeer commented 2 years ago

Ah, my bad, I was looking for a Connect container rather than a ksqlDB one.

Matesanz commented 2 years ago

> It's not possible to use standalone because the connectors wouldn't be persistent. The Connect container doesn't need to change; it already runs the connect-distributed script.

Sorry for my ignorance, but why wouldn't the connectors persist in standalone mode?

OneCricketeer commented 2 years ago

The Connect containers are ephemeral. Connect in standalone mode isn't configured with the three internal topics that store connector configs, statuses, and (source) offsets, so nothing survives a container restart.

https://docs.confluent.io/platform/current/connect/concepts.html
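For context, those topics are ordinary distributed-worker settings; in the Confluent Connect images they are supplied as environment variables. A minimal sketch of the relevant compose settings (topic names and the bootstrap address are illustrative; converters, the advertised REST host name, and replication factors are omitted):

  kafka-connect:
    image: confluentinc/cp-kafka-connect-base:latest
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      CONNECT_GROUP_ID: kafka-connect-group
      # The three internal topics that let connector configs, offsets,
      # and statuses outlive any single worker container
      CONNECT_CONFIG_STORAGE_TOPIC: _connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: _connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: _connect-status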