Closed: khsoneji closed this PR 7 months ago
Problem

CVE fixes:
https://confluentinc.atlassian.net/browse/CC-22575
https://confluentinc.atlassian.net/browse/CC-22217
Solution

Updated ivy and snappy-java to versions without known CVEs.
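For reference, a bump like this is typically done by pinning the patched versions in the project's `pom.xml`. The fragment below is an illustrative sketch, not the actual diff: whether this repo uses `dependencyManagement`, version properties, or direct dependency overrides may differ; only the version numbers are taken from the dependency tree below.

```xml
<!-- Illustrative pom.xml fragment: pin the patched versions via
     dependencyManagement. The exact mechanism used by this repo may
     differ; the versions match the dependency tree in this PR. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>1.1.10.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.ivy</groupId>
      <artifactId>ivy</artifactId>
      <version>2.5.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```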
Test Strategy

The CVEs are no longer showing in the PR scan: https://twistlock.tools.confluent-internal.io/#!/monitor/vulnerabilities/images/ci?search=Confluent%20Public%20Repo%20PR%20builder%2Fkafka-connect-hdfs%2FPR-672
The dependency tree shows the updated versions:

```
ksoneji@T9X6X34XJG kafka-connect-hdfs % mvn dependency:tree | grep -e snappy -e ivy
[INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.1.10.4:compile
[INFO] |  |  +- org.apache.ivy:ivy:jar:2.5.2:compile
```
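As a sanity check, the grep above can be turned into a small script that fails if a pre-fix version is still anywhere on the classpath. The helper below is a hypothetical sketch (not part of this repo); the minimum-safe versions are taken from the dependency tree output, and the mapping of those floors to the two CVEs is an assumption.

```python
import re

# Minimum safe versions for the two bumped artifacts (floors taken from
# the dependency tree in this PR; the CVE mapping is an assumption).
MIN_SAFE = {"snappy-java": (1, 1, 10, 4), "ivy": (2, 5, 2)}

def vulnerable(tree_output: str):
    """Return (artifact, version) pairs from `mvn dependency:tree` output
    that are older than the minimum safe version."""
    bad = []
    for artifact, floor in MIN_SAFE.items():
        for m in re.finditer(rf"{artifact}:jar:([\d.]+)", tree_output):
            version = tuple(int(p) for p in m.group(1).split("."))
            if version < floor:
                bad.append((artifact, m.group(1)))
    return bad
```

Piping `mvn dependency:tree` into this and failing the build when the list is non-empty would guard against a future transitive downgrade.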
Unit tests:

```
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 151, Failures: 0, Errors: 0, Skipped: 0
[INFO]
```
Docker playground test:

```
~/gitrepos/kafka-docker-playground/connect/connect-hdfs2-sink master wip
-------------------------------------------------------------------------------------------------------------------------------------------------------- 3s 10:43:50
> playground run -f hdfs2-sink.sh --connector-zip ~/gitrepos/kafka-connect-hdfs/target/components/packages/confluentinc-kafka-connect-hdfs-10.2.5-SNAPSHOT.zip
10:45:13 ℹ️ 🚀 Running example with flags
10:45:13 ℹ️ ⛳ Flags used are --connector-zip=/Users/vbalani/gitrepos/kafka-connect-hdfs/target/components/packages/confluentinc-kafka-connect-hdfs-10.2.5-SNAPSHOT.zip
10:45:14 ℹ️ 💀 Kill all docker containers
10:45:14 ℹ️ ####################################################
10:45:14 ℹ️ 🚀 Executing hdfs2-sink.sh in dir .
10:45:14 ℹ️ ####################################################
10:45:14 ℹ️ 💫 Using default CP version 7.5.0
10:45:14 ℹ️ 🎓 Use --tag option to specify different version, see https://kafka-docker-playground.io/#/how-to-use?id=🎯-for-confluent-platform-cp
10:45:14 ℹ️ 🎯🤐 CONNECTOR_ZIP (--connector-zip option) is set with /Users/vbalani/gitrepos/kafka-connect-hdfs/target/components/packages/confluentinc-kafka-connect-hdfs-10.2.5-SNAPSHOT.zip
10:45:15 ℹ️ 🧰 Checking if Docker image confluentinc/cp-server-connect-base:7.5.0 contains additional tools
10:45:15 ℹ️ 🧰 it can take a while if image is downloaded for the first time
10:45:15 ℹ️ 🎱 Installing connector from zip confluentinc-kafka-connect-hdfs-10.2.5-SNAPSHOT.zip
Running in a "--no-prompt" mode
Implicit acceptance of the license below:
Confluent Community License
http://www.confluent.io/confluent-community-license
Installing a component Kafka Connect HDFS 10.2.5-SNAPSHOT, provided by Confluent, Inc.
from the local file: /tmp/confluentinc-kafka-connect-hdfs-10.2.5-SNAPSHOT.zip
into directory: /usr/share/confluent-hub-components
Adding installation directory to plugin path in the following files:
  /etc/kafka/connect-distributed.properties
  /etc/kafka/connect-standalone.properties
  /etc/schema-registry/connect-avro-distributed.properties
  /etc/schema-registry/connect-avro-standalone.properties
Completed
10:47:50 ℹ️ Getting hive-jdbc-3.1.2-standalone.jar
--2023-11-16 10:47:51--  https://repo1.maven.org/maven2/org/apache/hive/hive-jdbc/3.1.2/hive-jdbc-3.1.2-standalone.jar
Resolving repo1.maven.org (repo1.maven.org)... 199.232.192.209, 199.232.196.209, 2a04:4e42:4c::209, ...
Connecting to repo1.maven.org (repo1.maven.org)|199.232.192.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72420147 (69M) [application/java-archive]
Saving to: 'hive-jdbc-3.1.2-standalone.jar'
hive-jdbc-3.1.2-standalone.jar 100%[=========================================>]  69.06M  17.8MB/s    in 4.7s
2023-11-16 10:47:58 (14.6 MB/s) - 'hive-jdbc-3.1.2-standalone.jar' saved [72420147/72420147]
10:47:59 ℹ️ 🛑 control-center is disabled
10:47:59 ℹ️ 🛑 ksqldb is disabled
10:47:59 ℹ️ 🛑 Grafana is disabled
10:47:59 ℹ️ 🛑 kcat is disabled
10:47:59 ℹ️ 🛑 conduktor is disabled
[+] Building 0.0s (0/0)  docker:desktop-linux
[+] Running 3/0
 ✔ Volume plaintext_datanode  Removed  0.0s
 ✔ Volume plaintext_namenode  Removed  0.0s
 ✔ Network plaintext_default  Removed  0.1s
[+] Building 0.0s (0/0)  docker:desktop-linux
[+] Running 13/13
 ✔ Network plaintext_default             Created  0.0s
 ✔ Volume "plaintext_namenode"           Created  0.0s
 ✔ Volume "plaintext_datanode"           Created  0.0s
 ✔ Container hive-server                 Started  0.1s
 ✔ Container zookeeper                   Started  0.1s
 ✔ Container namenode                    Started  0.1s
 ✔ Container datanode                    Started  0.1s
 ✔ Container broker                      Started  0.1s
 ✔ Container hive-metastore-postgresql   Started  0.1s
 ✔ Container hive-metastore              Started  0.1s
 ✔ Container presto-coordinator          Started  0.1s
 ✔ Container schema-registry             Started  0.1s
 ✔ Container connect                     Started  0.0s
10:48:03 ℹ️ 📝 To see the actual properties file, use cli command playground get-properties -c <container>
10:48:03 ℹ️ ✨ If you modify a docker-compose file and want to re-create the container(s), run cli command playground container recreate
10:48:03 ℹ️ ⌛ Waiting up to 480 seconds for connect to start
10:49:27 ℹ️ 🚦 connect is started!
10:49:27 ℹ️ 📊 JMX metrics are available locally on those ports:
10:49:27 ℹ️     - zookeeper       : 9999
10:49:27 ℹ️     - broker          : 10000
10:49:27 ℹ️     - schema-registry : 10001
10:49:27 ℹ️     - connect         : 10002
10:49:39 ℹ️ Creating HDFS Sink connector
10:49:41 ℹ️ 🛠️ Creating connector hdfs-sink
10:49:41 ℹ️ ✅ Connector hdfs-sink was successfully created
10:49:41 ℹ️ 💈 Configuration is
{
  "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
  "flush.size": "3",
  "hadoop.conf.dir": "/etc/hadoop/",
  "hive.database": "testhive",
  "hive.integration": "true",
  "hive.metastore.uris": "thrift://hive-metastore:9083",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "logs.dir": "/tmp",
  "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
  "rotate.interval.ms": "120000",
  "schema.compatibility": "BACKWARD",
  "store.url": "hdfs://namenode:8020",
  "tasks.max": "1",
  "topics": "test_hdfs",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://schema-registry:8081"
}
10:49:41 ℹ️ 🥁 Waiting a few seconds to get new status
10:49:49 ℹ️ 🧩 Displaying connector status for hdfs-sink
Name       Status      Tasks                  Stack Trace
-----------------------------------------------------------------------------------------------------------------------------
hdfs-sink  ✅ RUNNING  0:🟢 RUNNING[connect]  -
-------------------------------------------------------------------------------------------------------------
10:49:49 ℹ️ Sending messages to topic test_hdfs
10:49:50 ℹ️ 🔮 schema was identified as avro
10:49:50 ℹ️ ✨ generating data...
10:49:50 ℹ️ ☢️ --forced-value is set
10:49:50 ℹ️ ✨ 10 records were generated based on --forced-value (only showing first 10), took: 0min 1sec
{"f1":"value1"}
{"f1":"value2"}
{"f1":"value3"}
{"f1":"value4"}
{"f1":"value5"}
{"f1":"value6"}
{"f1":"value7"}
{"f1":"value8"}
{"f1":"value9"}
{"f1":"value10"}
10:49:55 ℹ️ 💯 Get number of records in topic test_hdfs
0
10:49:55 ℹ️ 📤 producing 10 records to topic test_hdfs
[2023-11-16 05:19:57,425] WARN MessageReader is deprecated. Please use org.apache.kafka.tools.api.RecordReader instead (kafka.tools.ConsoleProducer$)
10:49:58 ℹ️ 📤 produced 10 records to topic test_hdfs, took: 0min 3sec
10:49:59 ℹ️ 💯 Get number of records in topic test_hdfs
10
10:50:13 ℹ️ Listing content of /topics/test_hdfs/partition=0 in HDFS
Found 3 items
-rw-r--r--   3 appuser supergroup        213 2023-11-16 05:19 /topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro
-rw-r--r--   3 appuser supergroup        213 2023-11-16 05:19 /topics/test_hdfs/partition=0/test_hdfs+0+0000000003+0000000005.avro
-rw-r--r--   3 appuser supergroup        213 2023-11-16 05:19 /topics/test_hdfs/partition=0/test_hdfs+0+0000000006+0000000008.avro
10:50:15 ℹ️ Getting one of the avro files locally and displaying content with avro-tools
Successfully copied 2.05kB to /tmp/
23/11/16 05:20:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{"f1":"value1"}
{"f1":"value2"}
{"f1":"value3"}
10:50:19 ℹ️ Check data with beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.4/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.2 by Apache Hive
beeline> !connect jdbc:hive2://hive-server:10000/testhive
Enter username for jdbc:hive2://hive-server:10000/testhive: hive
Connecting to jdbc:hive2://hive-server:10000/testhive
Enter password for jdbc:hive2://hive-server:10000/testhive: ****
Connected to: Apache Hive (version 2.3.2)
Driver: Hive JDBC (version 2.3.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hive-server:10000/testhive> show create table test_hdfs;
+----------------------------------------------------+
|                   createtab_stmt                   |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `test_hdfs`(                 |
|   `f1` string COMMENT '')                          |
| PARTITIONED BY (                                   |
|   `partition` string COMMENT '')                   |
| ROW FORMAT SERDE                                   |
|   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'   |
| STORED AS INPUTFORMAT                              |
|   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' |
| OUTPUTFORMAT                                       |
|   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' |
| LOCATION                                           |
|   'hdfs://namenode:8020/topics/test_hdfs'          |
| TBLPROPERTIES (                                    |
|   'avro.schema.literal'='{"type":"record","name":"ConnectDefault","namespace":"io.confluent.connect.avro","fields":[{"name":"f1","type":"string"}]}', |
|   'transient_lastDdlTime'='1700111999')            |
+----------------------------------------------------+
15 rows selected (1.023 seconds)
0: jdbc:hive2://hive-server:10000/testhive> select * from test_hdfs;
+---------------+----------------------+
| test_hdfs.f1  | test_hdfs.partition  |
+---------------+----------------------+
| value1        | 0                    |
| value2        | 0                    |
| value3        | 0                    |
| value4        | 0                    |
| value5        | 0                    |
| value6        | 0                    |
| value7        | 0                    |
| value8        | 0                    |
| value9        | 0                    |
+---------------+----------------------+
9 rows selected (1.471 seconds)
0: jdbc:hive2://hive-server:10000/testhive> Closing: 0: jdbc:hive2://hive-server:10000/testhive
10:50:23 ℹ️ ####################################################
10:50:23 ℹ️ ✅ RESULT: SUCCESS for hdfs2-sink.sh (took: 5min 9sec - )
10:50:23 ℹ️ ####################################################
10:50:27 ℹ️ ✨ --connector flag was not provided, applying command to all connectors
10:50:27 ℹ️ 🧩 Displaying connector status for hdfs-sink
Name       Status      Tasks                  Stack Trace
-----------------------------------------------------------------------------------------------------------------------------
hdfs-sink  ✅ RUNNING  0:🟢 RUNNING[connect]  -
-------------------------------------------------------------------------------------------------------------
10:50:32 ℹ️ 🗯️ Version currently used for confluentinc-kafka-connect-hdfs is not latest
10:50:32 ℹ️ Current "🔢 v10.2.5-SNAPSHOT - 📅 release date: 2023-11-16"
10:50:32 ℹ️ Latest on Hub "🔢 v10.2.4 - 📅 release date: 2023-09-28"
10:50:32 ℹ️ 🌐 documentation is available at:
```
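The HDFS listing and the Hive query above are consistent with `flush.size=3`: the connector commits a file every 3 records, so 10 produced records yield three files covering offsets 0-2, 3-5, and 6-8, and the record at offset 9 stays buffered until the next flush or rotation, which is why Hive returns 9 rows. A small sketch of that arithmetic (the helper is ours for illustration, not connector code):

```python
# Reproduce the committed-file layout for the playground run above:
# flush.size=3 means one Avro file per 3 records, named
# <topic>+<partition>+<startOffset>+<endOffset>.avro.
def committed_files(num_records: int, flush_size: int,
                    topic: str = "test_hdfs", partition: int = 0):
    files = []
    for start in range(0, num_records - flush_size + 1, flush_size):
        end = start + flush_size - 1  # inclusive end offset of the file
        files.append(f"{topic}+{partition}+{start:010d}+{end:010d}.avro")
    return files
```

With 10 records and `flush_size=3` this yields exactly the three filenames in the HDFS listing, and 10 - 3*3 = 1 record remains uncommitted, matching the 9 rows selected in beeline.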