confluentinc / kafka-connect-datagen

Connector that generates data for demos
Apache License 2.0
20 stars 87 forks

CCLOG-1921 Fix for CVE-2020-36518 #114

Closed shaikzakiriitm closed 2 years ago

shaikzakiriitm commented 2 years ago

Problem

https://nvd.nist.gov/vuln/detail/CVE-2020-36518

Solution

Upgraded the common version to bring in the jackson-databind version that has the fix for this CVE. Also had to import confluent-log4j, because the log4j dependency was no longer being pulled in and the tests failed with the following error:

[ERROR] Tests run: 14, Failures: 0, Errors: 14, Skipped: 0, Time elapsed: 0.01 s <<< FAILURE! - in io.confluent.kafka.connect.datagen.DatagenTaskTest
[ERROR] io.confluent.kafka.connect.datagen.DatagenTaskTest.shouldGenerateFilesForUsersQuickstart  Time elapsed: 0.001 s  <<< ERROR!
java.lang.NoClassDefFoundError: org/apache/log4j/Level
        at io.confluent.kafka.connect.datagen.DatagenTaskTest.<clinit>(DatagenTaskTest.java:58)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Level
        at io.confluent.kafka.connect.datagen.DatagenTaskTest.<clinit>(DatagenTaskTest.java:58)
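The trace above points at the static initializer (`<clinit>`) of DatagenTaskTest, which would explain why all 14 tests error out at once rather than only those that log. A minimal sketch for checking whether a class such as org.apache.log4j.Level is visible on the classpath follows; `ClasspathProbe` and `isOnClasspath` are hypothetical names used for illustration and are not part of the connector.

```java
public final class ClasspathProbe {
    private ClasspathProbe() {}

    /** Returns true if the named class can be located; the class is not initialized. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static init
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class name is taken from the stack trace in this PR description.
        System.out.println("org.apache.log4j.Level present: "
            + isOnClasspath("org.apache.log4j.Level"));
    }
}
```

Running a probe like this with the test classpath is one way to confirm that a logging dependency has dropped out after a parent version bump.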

Also found a failure in the SpotBugs check during validation, listed below:

[INFO] --- spotbugs-maven-plugin:4.5.3.0:check (default) @ kafka-connect-datagen ---
[INFO] BugInstance size is 2
[INFO] Error size is 0
[INFO] Total bugs: 2
[ERROR] High: Random object created and used only once in io.confluent.kafka.connect.datagen.DatagenTask.start(Map) [io.confluent.kafka.connect.datagen.DatagenTask] At DatagenTask.java:[line 127] DMI_RANDOM_USED_ONLY_ONCE
[ERROR] High: Random object created and used only once in io.confluent.kafka.connect.datagen.DatagenTask.start(Map) [io.confluent.kafka.connect.datagen.DatagenTask, io.confluent.kafka.connect.datagen.DatagenTask, io.confluent.kafka.connect.datagen.DatagenTask] At DatagenTask.java:[line 123]Another occurrence at DatagenTask.java:[line 127]Another occurrence at DatagenTask.java:[line 136] DMI_RANDOM_USED_ONLY_ONCE

Addressed by making the Random instance a final field, created once per task instance. Verified that when the same seed is set in the user configs, two connector instances generate the same stream of data.
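A minimal sketch of the pattern described here, assuming a task that holds a single Random in a final field and seeds it from configuration when a seed is provided. `SeededTaskSketch` and `nextValues` are hypothetical names for illustration; this is not the actual DatagenTask code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public final class SeededTaskSketch {
    // One generator per task instance, created once and reused for every draw.
    // This avoids the create-and-use-once pattern flagged as DMI_RANDOM_USED_ONLY_ONCE.
    private final Random random;

    public SeededTaskSketch(Long seed) {
        // Assumption: a null seed means no seed was configured by the user.
        this.random = (seed == null) ? new Random() : new Random(seed);
    }

    /** Draws the next count values from this task's generator. */
    public List<Integer> nextValues(int count) {
        List<Integer> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            values.add(random.nextInt(1000));
        }
        return values;
    }

    public static void main(String[] args) {
        SeededTaskSketch first = new SeededTaskSketch(42L);
        SeededTaskSketch second = new SeededTaskSketch(42L);
        // Two instances with the same seed should produce identical sequences.
        System.out.println(first.nextValues(5).equals(second.nextValues(5)));
    }
}
```

Because java.util.Random is deterministic for a given seed, two instances built with the same seed yield the same sequence, which mirrors the manual check on two connector instances.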

Does this solution apply anywhere else?
If yes, where?

Yes. It applies to other connectors as well, and to all branches of datagen.

Test Strategy

All existing unit tests succeed. Manually verified which jackson-databind version is being pulled in:

*[cclog-1921][~/workspace/kafka-connect-datagen]$ mvn dependency:tree | grep 'jackson-databind'
[INFO] |  |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.13.2.2:compile

Also deployed a snapshot version of the connector with this PR's changes locally and verified that two instances of the connector with the same seed generate the same data stream with the USERS quickstart.

Testing done:

Release Plan

rishabhbits038 commented 2 years ago

@shaikzakiriitm Shouldn't confluent-log4j have <scope>test</scope> if it's only needed for the tests?

shaikzakiriitm commented 2 years ago

@shaikzakiriitm I think we should not include confluent-log4j, as it was recently replaced with reload4j in other connectors. See https://confluentinc.atlassian.net/wiki/spaces/~913794610/pages/2772764716/Backport+reload4j+to+replace+Confluent-log4j

Fixed it; migrated from slf4j-log4j12 to slf4j-reload4j.

shaikzakiriitm commented 2 years ago

@shaikzakiriitm Shouldn't confluent-log4j have <scope>test</scope> if it's only needed for the tests?

Got rid of confluent-log4j now.