icsnju / I2EC

Import data from Oracle into Kafka #5

Open yuping-nju opened 5 years ago

yuping-nju commented 5 years ago

@usernamehcx Configure the connector correctly and test the read performance.

yuping-nju commented 5 years ago

@usernamehcx Has the problem of adding a unique primary-key column been solved, or is there another approach?

usernamehcx commented 5 years ago

oracle -> kafka -> flink -> hdfs (single-node environment)

oracle -> kafka

This step uses the Kafka JDBC Connector (oracle to kafka). Configuration file: source.properties

name=test-oracle-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
poll.interval.ms=10000000
connection.password=EAST_TEST
connection.url=jdbc:oracle:thin:@nodeIP:PORT:XE
connection.user=EAST_TEST
table.whitelist=T_KJ_ZZQKM
mode=bulk
topic.prefix=test-

Run command:

/usr/bin/connect-standalone /etc/kafka/connect-standalone.properties source.properties

Import time: T_KJ_ZZQKM (22561642): 40 min

Kafka -> flink -> hdfs

For now this stage simply reads the data from Kafka, applies no processing, and writes it to HDFS.

Run time: T_KJ_ZZQKM (22561642): 39 min
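
To make the two timings easier to compare with later multi-node runs, they can be converted into throughput. The sketch below assumes that 22561642 is the number of records in T_KJ_ZZQKM (the thread does not say so explicitly); the function name `records_per_second` is only illustrative. Under that assumption both stages come out at roughly 9,400 to 9,600 records per second.

```python
def records_per_second(record_count: int, minutes: float) -> float:
    """Average throughput for a run that moved record_count records in the given minutes."""
    return record_count / (minutes * 60)


# Assumption: the number reported next to the table name is its record count.
RECORDS = 22_561_642

oracle_to_kafka = records_per_second(RECORDS, 40)  # oracle -> kafka took 40 min
kafka_to_hdfs = records_per_second(RECORDS, 39)    # kafka -> flink -> hdfs took 39 min

print(f"oracle -> kafka:         {oracle_to_kafka:,.0f} records/s")
print(f"kafka -> flink -> hdfs:  {kafka_to_hdfs:,.0f} records/s")
```

These are averages over the whole run, so they say nothing about where the time is spent inside each stage.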

Next steps

Try importing the data in parallel across multiple nodes, and test the read and export performance.

yuping-nju commented 5 years ago

@usernamehcx Try running a simple validation.
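
One possible form of such a validation, sketched here as an assumption rather than as what was actually run, is to compare the record count and an order-independent checksum of the source rows against what landed in HDFS. The helpers `fingerprint` and `validate` are hypothetical, and how the rows are read from Oracle and from HDFS (and serialized to strings) is left out.

```python
import hashlib
from typing import Iterable, Tuple


def fingerprint(records: Iterable[str]) -> Tuple[int, int]:
    """Return (record count, order-independent checksum) for serialized records."""
    count = 0
    checksum = 0
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).digest()
        # Summing modulo 2**64 ignores record order but still notices duplicates,
        # which a plain XOR of the hashes would cancel out.
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % (1 << 64)
        count += 1
    return count, checksum


def validate(source: Iterable[str], sink: Iterable[str]) -> bool:
    """True when both sides hold the same multiset of records (up to hash collisions)."""
    return fingerprint(source) == fingerprint(sink)


# Toy data standing in for rows read from Oracle and from HDFS.
source_rows = ["1,a", "2,b", "3,c"]
print(validate(source_rows, ["3,c", "1,a", "2,b"]))  # same rows, different order
print(validate(source_rows, ["1,a", "2,b"]))         # one row missing
```

Comparing counts alone is cheaper and may be enough as a first check; the checksum only helps if both sides serialize a row to exactly the same string.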