HashDataInc / bireme

Bireme is an incremental synchronization tool for the Greenplum / HashData data warehouse
https://hashdatainc.github.io/bireme/
Apache License 2.0
137 stars 53 forks

Syncing a MySQL table to a PostgreSQL database fails when the column order does not match exactly #109

Open dwwang1992 opened 6 years ago

dwwang1992 commented 6 years ago

I am syncing with debezium + kafka + bireme.

1. The source MySQL table broad has its columns in the order id (primary key), info, city. Its data is:

+----+---------+----------+
| id | info    | city     |
+----+---------+----------+
|  1 | record1 | hangzhou |
|  2 | record1 | hangzhou |
|  3 | c       | c        |
|  4 | c       | c        |
|  5 | c       | c        |
|  6 | c       | c        |
|  7 | c       | c        |
|  8 | c       | c        |
|  9 | c       | c        |
| 10 | c       | c        |
+----+---------+----------+

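2. The target PostgreSQL table broad has its columns in the order city, id (primary key), info, which does not fully match the source order. After bireme finishes syncing, the data is:

target=> select * from broad;
   city   | id |  info
----------+----+---------
 hangzhou |  2 | record1
 c        | 10 | c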

Only part of the data was synced. When the column order matches exactly, the sync works without problems.

How can I fix this partial sync when the column order does not match exactly?
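One observation about the output above: the two surviving rows are the last row for each distinct city value (id 2 for hangzhou, id 10 for c). That would be consistent with rows being merged on the first target column instead of on id. This is an inference from the posted data, not something confirmed in this thread, and the sketch below is not bireme's code. The helpers `merge_latest` and `to_target_order` are hypothetical names used only to reproduce the symptom with the tables from this report.

```python
SOURCE_COLUMNS = ["id", "info", "city"]   # MySQL column order from the report
TARGET_COLUMNS = ["city", "id", "info"]   # PostgreSQL column order from the report

# The ten source rows from the report, in source column order.
ROWS = [(1, "record1", "hangzhou"), (2, "record1", "hangzhou")] + [
    (i, "c", "c") for i in range(3, 11)
]


def to_target_order(row):
    """Rearrange a source-ordered row into the target column order, by name."""
    by_name = dict(zip(SOURCE_COLUMNS, row))
    return tuple(by_name[name] for name in TARGET_COLUMNS)


def merge_latest(rows, key_index):
    """Keep only the last row seen for each key value (upsert-style merge)."""
    latest = {}
    for row in rows:
        latest[row[key_index]] = row
    return list(latest.values())


target_rows = [to_target_order(row) for row in ROWS]

# Assumed faulty lookup: the key position comes from the source table
# (id is at index 0 there) but is applied to rows in target order,
# where index 0 is city.
faulty = merge_latest(target_rows, SOURCE_COLUMNS.index("id"))

# Name-based lookup: find id in the same column list the rows are ordered by.
correct = merge_latest(target_rows, TARGET_COLUMNS.index("id"))

print(faulty)        # two rows, one per distinct city
print(len(correct))  # all ten rows survive
```

With the faulty index the merge leaves exactly the two rows shown in the report; looking the key up by name in the target column list keeps all ten.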

wangzw commented 6 years ago

It is a bug. Which version are you using?

dwwang1992 commented 6 years ago

Version bireme-1.0.0. Roughly when will 2.0.0 be available?

wangzw commented 6 years ago

The first test release of 2.0 is out. This problem should already be fixed there.

dwwang1992 commented 6 years ago

I am using the first test release of 2.0, with the configuration file copied directly from 1.0. It fails on startup with the following error:

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
Caused by: java.lang.NullPointerException
    at cn.hashdata.bireme.Bireme.getTableInfo(Bireme.java:111)
    at cn.hashdata.bireme.Bireme.start(Bireme.java:283)
    ... 5 more
Cannot start daemon
Service exit with a return value of 5

How should I handle this?

dwwang1992 commented 6 years ago

The error log is logs/bireme.err; the output log, logs/bireme.out, reads as follows:

19:12:31 INFO Bireme - initialize Bireme daemon
19:12:31 FATAL class cn.hashdata.bireme.Config - Please designate your namespace.
19:12:31 FATAL Bireme - start failed. Message: Please designate your namespace..
19:12:31 FATAL Bireme - Stack Trace:
cn.hashdata.bireme.BiremeException: Please designate your namespace.
    at cn.hashdata.bireme.Config.fetchDebeziumConfig(Config.java:201) ~[bireme-2.0.0-alpha-1.jar:?]
    at cn.hashdata.bireme.Config.fetchSourceAndTableMap(Config.java:171) ~[bireme-2.0.0-alpha-1.jar:?]
    at cn.hashdata.bireme.Config.dataSourceConfig(Config.java:145) ~[bireme-2.0.0-alpha-1.jar:?]
    at cn.hashdata.bireme.Config.<init>(Config.java:85) ~[bireme-2.0.0-alpha-1.jar:?]
    at cn.hashdata.bireme.Bireme.parseCommandLine(Bireme.java:97) ~[bireme-2.0.0-alpha-1.jar:?]
    at cn.hashdata.bireme.Bireme.init(Bireme.java:272) [bireme-2.0.0-alpha-1.jar:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_101]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_101]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
    at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207) [commons-daemon-1.0.15.jar:1.0.15]
19:12:31 INFO Bireme - start Bireme daemon.
19:12:31 INFO Bireme - Start getting metadata of target tables from target database.

The debezium1.kafka.namespace option is already set in the configuration file etc/config.properties.

yinxs2003 commented 6 years ago

2018-08-17 13:48:10,791 main DEBUG LoggerContext[name=764c12b6, org.apache.logging.log4j.core.LoggerContext@408d971b] started OK.
13:48:10 INFO Bireme - initialize Bireme daemon
2018-08-17 13:48:10,798 main DEBUG AsyncLogger.ThreadNameStrategy=CACHED
13:48:10 FATAL class cn.hashdata.bireme.Config - Please designate your namespace.
13:48:10 FATAL Bireme - start failed. Message: Please designate your namespace..
13:48:10 FATAL Bireme - Stack Trace:
cn.hashdata.bireme.BiremeException: Please designate your namespace.

I am getting the same error. Has it been resolved?

yinxs2003 commented 6 years ago

My configuration file:

# target database where the data will sync into.
target.url = jdbc:postgresql://172.22.222.11:5432/aplath
target.user = laputa
target.passwd = laputa

# data source name list, separated by comma.
#data_source = maxwell1, debezium1
data_source = debezium1

## data source "mysql1" type
#maxwell1.type = maxwell
## kafka server which maxwell write binlog into.
#maxwell1.kafka.server = 172.22.222.25:9092
## kafka topic which maxwell write binlog into.
maxwell1.kafka.topic = syncdb
## kafka groupid used for consumer.
#maxwell1.kafka.groupid = bireme

# data source "debezium1"
#debezium1.kafka.namespace=172.22.222.21

debezium1.type = debezium
# kafka server which debezium write into.
debezium1.kafka.server = 172.22.222.25:9092
# kafka groupid used for consumer.
debezium1.kafka.groupid = bireme

# number of threads used for pipeline to drive the process
pipeline.thread_pool.size = 5

# number of threads used to transform data source record into target format.
transform.thread_pool.size = 10

# number of threads used to generate load tasks.
merge.thread_pool.size = 10
# interval of generating a load task in milliseconds.
merge.interval = 10000
# max tuple size for a load task
merge.batch.size = 50000

# JDBC connection pool size of target database.
loader.conn_pool.size = 10
# queue size of task for each table which is waiting for load.
loader.task_queue.size = 2

# application performance monitor report type: "none", "console", "jmx"
metrics.reporter=jmx
# interval of console APM reporter.
metrics.reporter.console.interval = 10

# set the IP address for bireme state server.
state.server.addr = 0.0.0.0
# set the port for bireme state server.
state.server.port = 8080

wangzw commented 6 years ago

https://github.com/HashDataInc/bireme/tree/master/integration_test/debezium/etc

The 2.0 and 1.0 configuration formats are not compatible. The link above is a sample configuration.
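A side note on the configuration posted by yinxs2003: its only namespace line, `#debezium1.kafka.namespace=172.22.222.21`, starts with `#`, so a properties parser treats it as a comment. The sketch below is plain Python and not part of bireme; `active_keys` is a hypothetical helper, and it handles only the simple `key = value` and `#` comment forms used in that file. It lists which keys are actually active in an excerpt of that configuration. Which key name 2.0 expects is not stated in this thread; the linked sample configuration is the reference for that.

```python
def active_keys(properties_text):
    """Return {key: value} for every line that is not blank or commented out."""
    settings = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Excerpt of the debezium1 section from the configuration posted above.
SAMPLE = """\
data_source = debezium1
#debezium1.kafka.namespace=172.22.222.21
debezium1.type = debezium
debezium1.kafka.server = 172.22.222.25:9092
debezium1.kafka.groupid = bireme
"""

settings = active_keys(SAMPLE)
namespace_keys = [key for key in settings if key.endswith("namespace")]

print(sorted(settings))
print(namespace_keys)  # empty: no namespace key is active in this excerpt
```

Running a check like this against a copied 1.0 file makes it easy to compare the active keys with those in the 2.0 sample before starting the daemon.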