Closed: naveenmall11 closed this pull request 1 year ago
Is there a reason to go with this scope limited fix instead of #652 ?
@janjwerner-confluent We are prioritizing the CVEs raised on or before 31 Dec 2022. After merging this PR, we will address the remaining CVEs.
LGTM after testing on the docker playground.
CONNECTOR_JAR is set to /home/nmall/kafka-connect-hdfs/target/kafka-connect-hdfs-10.1.16-SNAPSHOT.jar
/usr/share/confluent-hub-components/confluentinc-kafka-connect-hdfs/lib/kafka-connect-hdfs-10.2.0.jar
10:56:49 ℹ️ 👷🎯 Building Docker image vdesabou/kafka-docker-playground-connect:CP-7.3.2-kafka-connect-hdfs-10.1.16-SNAPSHOT.jar
10:56:49 ℹ️ Replacing kafka-connect-hdfs-10.2.0.jar with kafka-connect-hdfs-10.1.16-SNAPSHOT.jar
[+] Building 0.6s (7/7) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 244B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/vdesabou/kafka-docker-playground-connect:7.3.2 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 155.76kB 0.0s
=> [1/2] FROM docker.io/vdesabou/kafka-docker-playground-connect:7.3.2 0.1s
=> [2/2] COPY kafka-connect-hdfs-10.1.16-SNAPSHOT.jar /usr/share/confluent-hub-components/confluentinc-kafka-con 0.3s
=> exporting to image 0.1s
=> => exporting layers 0.0s
=> => writing image sha256:b58e39145e001fc738ceda2eb915d4f4f7657ef84b6e7a5e5d72461a4205d014 0.0s
=> => naming to docker.io/vdesabou/kafka-docker-playground-connect:CP-7.3.2-kafka-connect-hdfs-10.1.16-SNAPSHOT. 0.0s
10:56:50 ℹ️ 🛑 Grafana is disabled
10:56:50 ℹ️ 🛑 kcat is disabled
10:56:50 ℹ️ 🛑 conduktor is disabled
[+] Running 16/16
⠿ Container hive-metastore-postgresql Removed 5.4s
⠿ Container datanode Removed 13.9s
⠿ Container connect Removed 2.1s
⠿ Container namenode Removed 13.9s
⠿ Container schema-registry Removed 3.0s
⠿ Container hive-server Removed 11.8s
⠿ Container presto-coordinator Removed 3.1s
⠿ Container hive-metastore Removed 3.2s
⠿ Container zookeeper Removed 2.6s
⠿ Container broker Removed 11.6s
⠿ Container control-center Removed 14.1s
⠿ Container ksqldb-cli Removed 0.8s
⠿ Container ksqldb-server Removed 14.0s
⠿ Volume plaintext_namenode Removed 0.0s
⠿ Volume plaintext_datanode Removed 0.0s
⠿ Network plaintext_default Removed 0.2s
[+] Running 16/16
⠿ Network plaintext_default Created 0.0s
⠿ Volume "plaintext_datanode" Created 0.0s
⠿ Volume "plaintext_namenode" Created 0.0s
⠿ Container broker Started 3.4s
⠿ Container zookeeper Started 2.8s
⠿ Container namenode Started 3.3s
⠿ Container hive-server Started 1.9s
⠿ Container presto-coordinator Started 2.5s
⠿ Container hive-metastore-postgresql Started 2.8s
⠿ Container datanode Started 2.4s
⠿ Container hive-metastore Started 3.3s
⠿ Container schema-registry Started 4.2s
⠿ Container connect Started 6.3s
⠿ Container ksqldb-server Started 8.5s
⠿ Container control-center Started 8.7s
⠿ Container ksqldb-cli Started 10.2s
10:57:33 ℹ️ 📝 To see the actual properties file, use cli command playground get-properties -c <container>
10:57:33 ℹ️ ✨ If you modify a docker-compose file and want to re-create the container(s), run cli command playground recreate-container
10:57:33 ℹ️ ⌛ Waiting up to 480 seconds for connect to start
10:59:40 ℹ️ 🚦 connect is started!
10:59:40 ℹ️ 🔧 You can use ksqlDB with CLI using:
10:59:40 ℹ️ docker exec -i ksqldb-cli ksql http://ksqldb-server:8088
10:59:40 ℹ️ 📊 JMX metrics are available locally on those ports:
10:59:40 ℹ️ - zookeeper : 9999
10:59:40 ℹ️ - broker : 10000
10:59:40 ℹ️ - schema-registry : 10001
10:59:40 ℹ️ - connect : 10002
10:59:40 ℹ️ - ksqldb-server : 10003
11:01:20 ℹ️ Creating HDFS Sink connector
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1710 100 749 100 961 54 69 0:00:13 0:00:13 --:--:-- 180
{
"name": "hdfs-sink",
"config": {
"connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
"tasks.max": "1",
"topics": "test_hdfs",
"store.url": "hdfs://namenode:8020",
"flush.size": "3",
"hadoop.conf.dir": "/etc/hadoop/",
"partitioner.class": "io.confluent.connect.hdfs.partitioner.FieldPartitioner",
"partition.field.name": "f1",
"rotate.interval.ms": "120000",
"logs.dir": "/tmp",
"hive.integration": "true",
"hive.metastore.uris": "thrift://hive-metastore:9083",
"hive.database": "testhive",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"schema.compatibility": "BACKWARD",
"name": "hdfs-sink"
},
"tasks": [],
"type": "sink"
}
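The connector status shown above is what the Connect REST API returns after the connector is registered. As a hedged sketch (not the playground's actual script), the registration request could be built like this; the `localhost:8083` host/port and the trimmed-down config are assumptions, and only the request is constructed here, not sent:

```python
import json

# Subset of the HDFS sink config from the log above (illustrative only).
config = {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "tasks.max": "1",
    "topics": "test_hdfs",
    "store.url": "hdfs://namenode:8020",
    "flush.size": "3",
    "partitioner.class": "io.confluent.connect.hdfs.partitioner.FieldPartitioner",
    "partition.field.name": "f1",
}

# Kafka Connect REST API: PUT /connectors/<name>/config creates or updates
# a connector. Host/port are an assumption for this sketch.
url = "http://localhost:8083/connectors/hdfs-sink/config"
payload = json.dumps(config)
# To actually send it one would use e.g.:
#   urllib.request.Request(url, data=payload.encode(),
#                          headers={"Content-Type": "application/json"},
#                          method="PUT")
print(url)
```

Sending this payload with `Content-Type: application/json` yields a response shaped like the JSON block above (`name`, `config`, `tasks`, `type`).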
11:02:04 ℹ️ Sending messages to topic test_hdfs
11:02:28 ℹ️ Listing content of /topics/test_hdfs in HDFS
Found 10 items
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value1
drwxrwxrwx - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value10
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value2
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value3
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value4
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value5
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value6
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value7
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value8
drwxr-xr-x - appuser supergroup 0 2023-03-29 05:32 /topics/test_hdfs/f1=value9
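The one-directory-per-value layout above comes from the `FieldPartitioner` with `partition.field.name=f1`: each record is routed to a Hive-style `key=value` directory under the topic path. A minimal sketch of that path derivation (not the connector's actual code):

```python
def partition_path(topics_dir: str, topic: str, field: str, record: dict) -> str:
    """Hive-style partition directory for a record: <topics.dir>/<topic>/<field>=<value>."""
    return f"{topics_dir}/{topic}/{field}={record[field]}"

# Ten records with distinct f1 values -> ten partition directories,
# matching the HDFS listing above.
records = [{"f1": f"value{i}"} for i in range(1, 11)]
paths = sorted({partition_path("/topics", "test_hdfs", "f1", r) for r in records})
print(paths[0])  # /topics/test_hdfs/f1=value1
```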
11:02:31 ℹ️ Getting one of the avro files locally and displaying content with avro-tools
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{"f1":"value1"}
11:02:36 ℹ️ Check data with beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.4/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.2 by Apache Hive
beeline> !connect jdbc:hive2://hive-server:10000/testhive
Enter username for jdbc:hive2://hive-server:10000/testhive: hive
Enter password for jdbc:hive2://hive-server:10000/testhive: ****
Connecting to jdbc:hive2://hive-server:10000/testhive
Connected to: Apache Hive (version 2.3.2)
Driver: Hive JDBC (version 2.3.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hive-server:10000/testhive> show create table test_hdfs;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `test_hdfs`( |
| ) |
| PARTITIONED BY ( |
| `f1` string COMMENT '') |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' |
| LOCATION |
| 'hdfs://namenode:8020/topics/test_hdfs' |
| TBLPROPERTIES ( |
| 'avro.schema.literal'='{"type":"record","name":"ConnectDefault","namespace":"io.confluent.connect.avro","fields":[]}', |
| 'transient_lastDdlTime'='1680067945') |
+----------------------------------------------------+
15 rows selected (2.135 seconds)
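Note that the `avro.schema.literal` above has an empty `fields` list: `f1` is the partition field, so Hive reads its value from the `f1=<value>` directory name rather than from the Avro files themselves. A minimal sketch of that directory-name parsing (not Hive's actual implementation):

```python
def partition_value(path: str, field: str) -> str:
    """Extract a Hive-style partition value (field=value) from an HDFS path."""
    for part in path.strip("/").split("/"):
        if part.startswith(field + "="):
            return part[len(field) + 1:]
    raise ValueError(f"no {field}= segment in {path}")

# The partition column's value comes from the path, which is why the
# SELECT below returns f1 values despite the field-less Avro schema.
print(partition_value("/topics/test_hdfs/f1=value1", "f1"))  # value1
```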
0: jdbc:hive2://hive-server:10000/testhive> select * from test_hdfs;
+---------------+
| test_hdfs.f1 |
+---------------+
| value1 |
| value2 |
| value3 |
| value4 |
| value5 |
| value6 |
| value7 |
| value8 |
| value9 |
+---------------+
9 rows selected (2.944 seconds)
0: jdbc:hive2://hive-server:10000/testhive> Closing: 0: jdbc:hive2://hive-server:10000/testhive
Problem
https://confluentinc.atlassian.net/browse/CC-18965
https://confluentinc.atlassian.net/browse/CCMSG-2266
https://confluentinc.atlassian.net/browse/CCMSG-2248
Twistlock scan link: https://twistlock.tools.confluent-internal.io/#!/monitor/vulnerabilities/images/ci?search=Confluent%20Public%20Repo%20PR%20builder%2Fkafka-connect-hdfs%2FPR-656
Does this solution apply anywhere else?
If yes, where?
Test Strategy
Testing done:
Release Plan