confluentinc / kafka-images

Confluent Docker images for Apache Kafka
Apache License 2.0

cp-kafka-connect fails inconsistently due to multiple loggers in classpath #142

Open oscarjohansson94 opened 2 years ago

oscarjohansson94 commented 2 years ago

Hi!

This refers to cp-kafka-connect:7.0.1.

When running the cp-kafka-connect Docker image, we see inconsistent SLF4J behavior. The default CUB_CLASSPATH is set to /usr/share/java/cp-base-new/, which contains multiple SLF4J logger bindings.

On one machine we consistently get the error:

===> Check if Kafka is healthy ...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (io.confluent.admin.utils.cli.KafkaReadyCommand).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

So it finds multiple logger bindings (both are included in the cp-kafka-connect image), chooses Log4jLoggerFactory, and proceeds to crash. The documentation provided (http://www.slf4j.org/codes.html#multiple_bindings) states that there should be only one binding on the class path.

However, on another machine we consistently get the output:

===> Check if Kafka is healthy ...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]

In this case it happens to choose SimpleLoggerFactory instead of Log4jLoggerFactory, and proceeds happily.

Workaround: delete /usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar.

It seems that /usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar should not be on the classpath.
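To check which jars in a directory carry an SLF4J binding, something like the following sketch may help. It looks for the `org/slf4j/impl/StaticLoggerBinder.class` entry that appears in the log output above. The function name `find_slf4j_bindings` is made up for illustration, and the sketch assumes Python 3 is available wherever the jar directory can be read (for example, on a copy of the directory taken from the container).

```python
import zipfile
from pathlib import Path

# Class file reported in the "Found binding in" log lines above.
BINDER_CLASS = "org/slf4j/impl/StaticLoggerBinder.class"


def find_slf4j_bindings(directory):
    """Return the sorted names of jars in `directory` that contain an SLF4J binding.

    Hypothetical helper for illustration; it is not part of any Confluent tooling.
    """
    bindings = []
    for jar in Path(directory).glob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as archive:
                if BINDER_CLASS in archive.namelist():
                    bindings.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return sorted(bindings)
```

Calling `find_slf4j_bindings("/usr/share/java/cp-base-new")` on an affected image would be expected to list both slf4j-log4j12 and slf4j-simple; more than one entry means the "multiple bindings" warning will appear.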

stondini commented 2 years ago

Hi, does this issue prevent cp-kafka-connect from starting properly? My container is stuck in the health check:

d495df064e84 confluentinc/cp-kafka-connect:7.0.1 "/etc/confluent/dock…" 9 minutes ago Up 9 minutes (health: starting)

Then it restarts every 10 minutes.

oscarjohansson94 commented 2 years ago

@stondini If org.slf4j.impl.Log4jLoggerFactory gets chosen then cp-kafka-connect will not start properly. But I think you need to provide more information before we can determine if you experience the same issue as me. Could you please post the logging you get when this happens?

stondini commented 2 years ago

@oscarjohansson94 I get the same logs as you. I downgraded Connect to confluentinc/cp-kafka-connect:6.1.4 which works well.

oscarjohansson94 commented 2 years ago

@stondini If you want to use 7.0.1, you can add this to your Dockerfile to remove one of the logger bindings:

FROM confluentinc/cp-kafka-connect:7.0.1
RUN rm /usr/share/java/cp-base-new/slf4j-log4j12-*.jar

stondini commented 2 years ago

I'm using Docker Compose, so I modified the compose file to run the rm command:

 command:
      - bash
      - -c
      - |
        rm /usr/share/java/cp-base-new/slf4j-log4j12-*.jar
        echo "Launching Kafka Connect"
        /etc/confluent/docker/run &
        sleep infinity

Content of the folder /usr/share/java/cp-base-new without the rm command:

-rw-r--r-- 1 appuser appuser   41472 Jan 11 22:01 slf4j-api-1.7.30.jar
-rw-r--r-- 1 appuser appuser   12211 Jan 11 22:06 slf4j-log4j12-1.7.30.jar
-rw-r--r-- 1 appuser appuser   15239 Jan 11 22:06 slf4j-simple-1.7.30.jar

Content of the folder /usr/share/java/cp-base-new with the rm command:

-rw-r--r-- 1 appuser appuser   41472 Jan 11 22:01 slf4j-api-1.7.30.jar
-rw-r--r-- 1 appuser appuser   15239 Jan 11 22:06 slf4j-simple-1.7.30.jar

But nothing changed. Connect still restarts every 10 minutes.

oscarjohansson94 commented 2 years ago

@stondini You are removing the file with the container command, which is executed at run time. This could be a case where:

  1. The container starts with two logger bindings and fails initialization.
  2. Your command runs and removes the binding after initialization has already failed.
  3. You inspect the container and see only one binding.

I suggest that you copy my example Dockerfile, build it with Docker Compose, and remove your command.

donatelloOo commented 2 years ago

Besides the fact that several implementations of SLF4J in classpath do not give confidence as to the quality of this version, these are only warnings...

bimguess commented 1 year ago

As a workaround, you can use this in your Kubernetes deployment:

      volumeMounts:
        - name: slf4j-exclude
          mountPath: /exclude
          readOnly: true
  volumes:
    - name: slf4j-exclude
      emptyDir: {}

ZisisFl commented 3 months ago

Moving to version 7.0.14 also solved the issue.