snowflakedb / snowflake-kafka-connector

Snowflake Kafka Connector (Sink Connector)

Logging Issues #953

Open c0desurfer opened 1 month ago

c0desurfer commented 1 month ago

I was going through the issues here and found https://github.com/snowflakedb/snowflake-kafka-connector/issues/785 and https://community.snowflake.com/s/article/How-to-avoid-the-log-INFO-messages-in-the-Kafka-connector-getting-printed-on-to-the-console-while-connecting-to-Snowflake. We have a very similar problem. Our log4j configuration looks like this.

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} %p %X{connector.context}%m (%c) [%t]%n
connect.root.logger.level=ERROR
log4j.rootLogger=${connect.root.logger.level}, CONSOLE
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.org.I0Itec.zkclient=ERROR
log4j.logger.org.reflections=ERROR
log4j.logger.net.snowflake=ERROR
log4j.logger.com.snowflake=ERROR

We would like the SnowflakeSinkConnector to stop logging at INFO level and, additionally, to include the name of the connector the logs come from. Currently, if we have multiple projects and connectors, it is impossible to tell which connector has a problem, and tons of log entries at INFO level are produced.

The log entries look as follows.

Oct 09, 2024 2:20:53 PM net.snowflake.client.jdbc.cloud.storage.SnowflakeAzureClient upload
INFO: Uploaded data from input stream to Azure location: streaming-ingest. It took 101 ms with 0 retries
Oct 09, 2024 2:20:59 PM net.snowflake.client.jdbc.cloud.storage.SnowflakeAzureClient createSnowflakeAzureClient
INFO: Initializing Snowflake Azure client with encryption: true

We are running this together with Strimzi on OpenShift, and logging to a file is not an option.

Funnily enough, there is this Confluent blog post at https://www.confluent.io/blog/kafka-connect-improvements-in-apache-kafka-2-3/ which states the following.

Probably second in top frustrations with Kafka Connect behind the rebalance issue (which has greatly improved as shown above) is the difficulty in determining in the Kafka Connect worker log which message belongs to which connector.

And still the Snowflake connector somehow manages to emit log entries that can't be attributed to a specific connector instance.

It seems that the approach in https://github.com/snowflakedb/snowflake-kafka-connector/issues/263 partly solves my problem. I can set the log level to ERROR successfully. EDIT: Nope, still INFO log entries.

And there is another similar issue at https://github.com/snowflakedb/snowflake-kafka-connector/issues/87.

Alright, the solution (actually a workaround) is described in https://github.com/snowflakedb/snowflake-jdbc/issues/1134. You have to set -Dnet.snowflake.jdbc.loggerImpl=net.snowflake.client.log.SLF4JLogger for the Snowflake JDBC lib to respect the log settings.
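
For anyone else running into this: how exactly the property reaches the Connect worker JVM depends on the deployment, so treat this as a minimal sketch. On a self-managed worker it can go through KAFKA_OPTS (which the standard Kafka startup scripts pick up); with Strimzi the equivalent would be a Java system property set through the KafkaConnect resource's JVM options.

# Sketch for a self-managed worker: route the Snowflake JDBC driver's logging
# through SLF4J/log4j so the log4j.logger.net.snowflake / com.snowflake settings apply.
export KAFKA_OPTS="-Dnet.snowflake.jdbc.loggerImpl=net.snowflake.client.log.SLF4JLogger"
bin/connect-distributed.sh config/connect-distributed.properties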

simonepm commented 1 month ago

I confirm this behavior: the log4j logger settings are completely ignored despite being on the classpath, and when using Strimzi the connector writes everything to the console as plain text.

c0desurfer commented 1 month ago

Yes, because of the logger selection in https://github.com/snowflakedb/snowflake-jdbc/blob/master/src/main/java/net/snowflake/client/log/SFLoggerFactory.java#L46-L66 you have to set -Dnet.snowflake.jdbc.loggerImpl=net.snowflake.client.log.SLF4JLogger for the lib to respect the settings. At least it seems to work for me.

sfc-gh-gjachimko commented 1 month ago

Thank you for your comments. We have work in our backlog to update the documentation on setting up logging and to provide some basic recommendations for various scenarios. Please keep in mind that the logging we can provide is limited to whatever the hosting process allows us to do. I would also recommend looking at https://docs.confluent.io/platform/current/connect/logging.html#change-the-log-level-for-a-specific-connector- for some Confluent tips on how to change logging parameters at runtime.
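
As a hedged example of the runtime approach from that page (assuming the worker's REST listener is reachable on localhost:8083), the Connect admin endpoint can raise the connector package's logger to ERROR. Note this only affects loggers that go through log4j, so the JDBC driver's java.util.logging output still needs the system property mentioned above.

# Change the log level of the Snowflake connector's loggers at runtime
# via the Kafka Connect admin REST API.
curl -X PUT -H "Content-Type: application/json" \
  -d '{"level": "ERROR"}' \
  http://localhost:8083/admin/loggers/com.snowflake.kafka.connector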