confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
128 stars 1.04k forks source link

Using slf4j in custom UDF function doesn't log to anything #5315

Open keweishang opened 4 years ago

keweishang commented 4 years ago

Describe the bug Using slf4j to log in a UDF function doesn't log anything because the org.apache.log4j.Appender and org.apache.log4j.DailyRollingFileAppender are loaded by 2 different classloaders - io.confluent.ksql.function.UdfClassLoader and sun.misc.Launcher$AppClassLoader respectively.

To Reproduce Steps to reproduce the behavior, include:

  1. The version of KSQL: KSQL Server 5.5.0 and 5.3.2. ksql-udf maven dependency 5.4.2 (latest available).

  2. Sample source data: Sending any AVRO record that has the varchar_col STRING field to the cdc_local.my_db.type_table topic would do.

  3. Any SQL statements you ran:

    
    CREATE STREAM cdc_local__my_db__type_table
    WITH (KAFKA_TOPIC='cdc_local.my_db.type_table', VALUE_FORMAT='AVRO');

CREATE STREAM redshiftmy_db__type_table WITH (KAFKA_TOPIC='redshift.my_db.type_table', VALUE_FORMAT='AVRO', PARTITIONS=1, REPLICAS=1) AS SELECT REVERSE(varchar_col) AS "reverse_varchar_col" FROM cdc_localmy_db__type_table;


4. The UDF class definition:

import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; import io.confluent.ksql.function.udf.UdfParameter; import org.slf4j.Logger; import org.slf4j.LoggerFactory;

@UdfDescription( name = "reverse", description = "Example UDF that reverses an object", version = "1.0-SNAPSHOT", author = "kewei shang" ) public class ReverseUdf {

private static Logger logger = LoggerFactory.getLogger(ReverseUdf.class);

@Udf(description = "Reverse a string") public String reverseString( @UdfParameter(value = "source", description = "the value to reverse") final String source ) { logger.info("helloworld from logger"); System.out.println("helloworld from stdout"); return new StringBuilder(source).reverse().toString(); }

}


5. KSQL server's log4j properties:

log4j.rootLogger=INFO, file

appenders

log4j.appender.file=org.apache.log4j.DailyRollingFileAppender log4j.appender.file.DatePattern='.'yyyy-MM-dd log4j.appender.file.File=${ksql.log.dir}/ksql.log log4j.appender.file.layout=org.apache.log4j.PatternLayout log4j.appender.file.layout.ConversionPattern=[%d] %p %m (%c:%L)%n log4j.appender.file.append=true

loggers

log4j.logger.org.apache.kafka.streams=INFO, file log4j.additivity.org.apache.kafka.streams=false

log4j.logger.kafka=INFO, file log4j.additivity.kafka=false

log4j.logger.org.apache.zookeeper=INFO, file log4j.additivity.org.apache.zookeeper=false

log4j.logger.org.apache.kafka=INFO, file log4j.additivity.org.apache.kafka=false

log4j.logger.org.I0Itec.zkclient=INFO, file log4j.additivity.org.I0Itec.zkclient=false

log4j.logger.processing=INFO, file log4j.additivity.processing=false


6. KSQL Server start script:

export LOG_DIR=/Users/keweishang/test/production-ksql/log export KSQL_LOG4J_OPTS=-Dlog4j.configuration=file:/Users/keweishang/test/production-ksql/ksql-log4j.properties export KSQL_OPTS=-javaagent:/Users/keweishang/test/production-ksql/jmx_prometheus_javaagent-0.12.0.jar=8104:/Users/keweishang/test/production-ksql/jmx-ksql-server-prometheus.yml /Users/keweishang/test/production-ksql/confluent-5.5.0/bin/ksql-server-start /Users/keweishang/test/production-ksql/ksql-server-local.properties


7. KSQL server properties:

ksql.service.id=ksqllocal ksql.logging.processing.topic.auto.create=true ksql.logging.processing.topic.replication.factor=1 ksql.logging.processing.topic.partitions=1 ksql.logging.processing.stream.auto.create=true ksql.logging.processing.rows.include=true bootstrap.servers=localhost:9092 ksql.schema.registry.url=http://localhost:8082 confluent.support.metrics.enable=false confluent.support.customer.id=anonymous ksql.streams.num.stream.threads=1 ksql.streams.auto.offset.reset=earliest ksql.sink.partitions=1 ksql.internal.topic.replicas=1 ksql.extension.dir=/Users/keweishang/test/production-ksql/ksql_udfs ksql.queries.file=/Users/keweishang/test/production-ksql/ksql_queries.sql


**Expected behavior**
No log4j related logging error in stdout when launching KSQL Server. The line `helloworld from logger` is logged to the `ksql.log` file.

**Actual behaviour**
1. CLI output. When starting KSQL server, I saw the following error in stdout:

log4j:ERROR The class "org.apache.log4j.Appender" was loaded by log4j:ERROR [io.confluent.ksql.function.UdfClassLoader@2819f478] whereas object of type log4j:ERROR "org.apache.log4j.DailyRollingFileAppender" was loaded by [sun.misc.Launcher$AppClassLoader@18b4aac2]. log4j:ERROR Could not instantiate appender named "file". ... log4j:WARN No appenders could be found for logger (transformation.IsNullOrEmptyTransformer). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


2. KSQL logs:
No `helloworld from logger` is logged to the `ksql.log` file.

**Additional context**
Our KSQL Server runs in headless mode.
spena commented 4 years ago

@keweishang I tried to reproduce this but I couldn't. I am using the headless mode as you mentioned and using the 5.5.0 Server + 5.4.2 version for UDF dependency. I also used the same logger file you have.

This is the output of the log file:

[2020-05-08 12:53:06,125] INFO Server up and running (io.confluent.ksql.rest.server.KsqlServerMain:76)
[2020-05-08 12:53:06,748] INFO Successfully submitted metrics to Confluent via secure endpoint (io.confluent.support.metrics.submitters.ConfluentSubmitter:143)
[2020-05-08 14:40:52,076] INFO helloworld from logger (com.ksql.udf.ReverseUdf:24)
[2020-05-08 14:43:15,161] INFO helloworld from logger (com.ksql.udf.ReverseUdf:24)

To debug more, I added the java verbose parameter to print when classes are loaded. Can you dod that too? Here are the steps:

export KSQL_OPTS="-verbose:class"
./bin/ksql-server-start.sh ... | tee /tmp/debug

I let the KSQL start and send all the classloader list to the above file. Then I searched for the classes to see which how they were loaded:

$ grep -n DailyRollingFileAppender /tmp/debug
1362:[Loaded org.apache.log4j.DailyRollingFileAppender from file:/opt/confluent/versions/confluent-5.5.0/share/java/ksqldb/log4j-1.2.17.jar]

$ grep -n log4j.Appender /tmp/debug
1356:[Loaded org.apache.log4j.Appender from file:/opt/confluent/versions/confluent-5.5.0/share/java/ksqldb/log4j-1.2.17.jar]

Both are loaded by the log4j-1.2.17.jar.

Btw, I am using the 5.5.0 from the TAR package which is installed in /opt/confluent/versions/confluent-5.5.0.

How is your installation environment? Could you run the classloader verbose mentioned above? Could you try using the ksqldb 5.5.0 dependency instead? Notice that ksql is renamed to ksqldb.

<dependency>
        <groupId>io.confluent.ksql</groupId>
        <artifactId>ksqldb-udf</artifactId>
        <version>5.5.0</version>
    </dependency>
keweishang commented 4 years ago

@spena Thanks for investigating this issue.

I'm also using 5.5.0 from the TAR package, which is installed in /Users/keweishang/test/production-ksql/confluent-5.5.0/.

Running the verbose java option gives the following output. As you can see, org.apache.log4j.Appender is loaded twice from different jars:

$ grep -n log4j.Appender /tmp/debug
1362:[Loaded org.apache.log4j.Appender from file:/Users/keweishang/test/production-ksql/confluent-5.5.0/share/java/ksqldb/log4j-1.2.17.jar]
5418:[Loaded org.apache.log4j.Appender from file:/Users/keweishang/test/production-ksql/ksql_udfs/udf-example-1.0-SNAPSHOT-jar-with-dependencies.jar]

$ grep -n DailyRollingFileAppender /tmp/debug
1368:[Loaded org.apache.log4j.DailyRollingFileAppender from file:/Users/keweishang/test/production-ksql/confluent-5.5.0/share/java/ksqldb/log4j-1.2.17.jar]

Using ksqldb 5.5.0 reproduces the same error result. I built the jar using the following pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>io.kewei</groupId>
    <artifactId>udf-example</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>KSQL UDF / UDAF Quickstart</name>

    <properties>
        <exec.mainClass>my.company.ksqldb.udfdemo.thisisignored</exec.mainClass>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <ksql.version>5.5.0</ksql.version>
        <junit.jupiter.version>5.4.1</junit.jupiter.version>
        <maven.surefire.version>2.22.1</maven.surefire.version>
    </properties>

    <repositories>
        <repository>
            <id>confluent</id>
            <url>http://packages.confluent.io/maven/</url>
        </repository>
        <repository>
            <id>ossrh</id>
            <name>Maven Central</name>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
    </repositories>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.6.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <!--package as one fat jar -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.5.2</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>${exec.mainClass}</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>assemble-all</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <!-- KSQL dependencies -->
        <dependency>
            <groupId>io.confluent.ksql</groupId>
            <artifactId>ksqldb-udf</artifactId>
            <version>${ksql.version}</version>
        </dependency>
        <!-- Test dependencies -->
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter</artifactId>
            <version>${junit.jupiter.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

My java version:

$ java -version
openjdk version "1.8.0_242"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.242-b08, mixed mode)

Does your UDF fatjar contain the org.apache.log4j.Appender class?

keweishang commented 4 years ago

Excluding explicitly the log4j dependency from the ksqldb-udf dependency solves the problem. Why does ksqldb-udf, as a library, has slf4j-log4j12 (therefore log4j) dependency in the first place? Did you exclude log4j explicitly as well?

        <dependency>
            <groupId>io.confluent.ksql</groupId>
            <artifactId>ksqldb-udf</artifactId>
            <version>${ksql.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

Here's the output of mvn dependency:tree:

[INFO] ------------------------< io.kewei:udf-example >------------------------
[INFO] Building KSQL UDF / UDAF Quickstart 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ udf-example ---
[INFO] io.kewei:udf-example:jar:1.0-SNAPSHOT
[INFO] +- io.confluent.ksql:ksqldb-udf:jar:5.5.0:compile
[INFO] |  +- org.slf4j:slf4j-api:jar:1.7.26:compile
[INFO] |  +- org.slf4j:slf4j-log4j12:jar:1.7.26:compile
[INFO] |  |  \- log4j:log4j:jar:1.2.17:compile
[INFO] |  \- io.confluent:common-utils:jar:5.5.0:compile
spena commented 4 years ago

Seems the issue is how you're packaging the UDAF. If I create the jar with the jar-with-dependencies plugin you're using, then I get problems with the logger.

$ grep -n 'log4j.Appender ' /tmp/debug
1356:[Loaded org.apache.log4j.Appender from file:/opt/confluent/versions/confluent-5.5.0/share/java/ksqldb/log4j-1.2.17.jar]
5440:[Loaded org.apache.log4j.Appender from file:/home/sergio/workspace/projects/issues/classloader_log_5315/udf/target/reverse-udf-1.0.0-jar-with-dependencies.jar]

The logger jars from the jar-with-dependencies is conflicting with the ones loaded/required by KSQL. As you mentioned, excluding them makes it work.

Looking at the code, the slf4j-log4j12 is a transitive dependency brought by other internal libraries used by ksqldb-udf. It is used by KSQL. Although I don't know if it should be excluded to avoid putting in the classpath for custom udfs. Will investigate about this.

keweishang commented 4 years ago

Good to know that you’ve managed to reproduce the error. Please keep me updated on this one.

What was your maven configuration when you didn’t reproduce this error. Sounds like you didn’t exclude the dependency but it worked anyway?

keweishang commented 4 years ago

Hi, @spena do you have any update on this issue?

agavra commented 4 years ago

I think this should have been fixed with #5158 as ksqldb-udf should no longer depend on log4j.

keweishang commented 4 years ago

Sorry for the delay. It seems that the ksqldb-udf 6.0.0 dependency still brings in the io.confluent:confluent-log4j:jar:1.2.17-cp2:compile transitive dependency, which brings in log4j. Can you have a look at this, please? Thanks. @agavra @spena