DTStack / chunjun

A data integration framework
https://dtstack.github.io/chunjun/
Apache License 2.0

flinkx 1.12 MySQL-to-Hive sync cannot resume from checkpoint (yarn-perjob mode) #1139

Open biandou1313 opened 2 years ago

biandou1313 commented 2 years ago

Search before asking

What happened

{
    "job": {
        "setting": {
            "errorLimit": {},
            "speed": {
                "channel": 1,
                "bytes": 0
            },
            "log": {
                "isLogger": false
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [
                            {
                                "name": "id",
                                "type": "INT",
                                "precision": 10,
                                "columnDisplaySize": 10
                            },
                            {
                                "name": "address",
                                "type": "VARCHAR"
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "increColumn": "id",
                        "connection": [
                            {
                                "table": [
                                    "mysqlreader2"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://172.18.8.77:3306/zk_test"
                                ]
                            }
                        ],
                        "dataSourceId": 21
                    }
                },
                "writer": {
                    "name": "hivewriter",
                    "parameter": {
                        "jdbcUrl": "jdbc:hive2://172.18.8.208:10000/Vasyslink_yag001",
                        "fileType": "text",
                        "fieldDelimiter": "\t",
                        "writeMode": "append",
                        "charsetName": "UTF-8",
                        "maxFileSize": 1073741824,
                        "tablesColumn": "{\"dept22\":[{\"key\":\"deptno\",\"type\":\"int\",\"precision\":10,\"columnDisplaySize\":11},{\"key\":\"address\",\"type\":\"string\"}]}",
                        "defaultFS": "hdfs://172.18.8.207:8020",
                        "dataSourceId": 15,
                        "partition": "pt",
                        "partitionType": "USERDEFINED",
                        "partitionValue": "2022041000"
                    }
                }
            }
        ]
    }
}

What you expected to happen

The next run should start from the position recorded by the previous run.

How to reproduce

Run the job JSON shown above under "What happened".

{ "job": { "setting": { "errorLimit": {}, "speed": { "channel": 1, "bytes": 0 }, "log": { "isLogger": false } }, "content": [ { "reader": { "name": "mysqlreader", "parameter": { "column": [ { "name": "id", "type": "INT", "precision": 10, "columnDisplaySize": 10 }, { "name": "address", "type": "VARCHAR"

                        }
                    ],
                    "username": "root",
                    "password": "123456",
                    "increColumn":"id",

                    "connection": [
                        {
                            "table": [
                                "mysqlreader2"
                            ],
                            "jdbcUrl": [
                                "jdbc:mysql://172.18.8.77:3306/zk_test"
                            ]
                        }
                    ],
                    "dataSourceId": 21
                }
            },
            "writer": {
                "name": "hivewriter",
                "parameter": {
                    "jdbcUrl": "jdbc:hive2://172.18.8.208:10000/Vasyslink_yag001",
                    "fileType": "text",
                    "fieldDelimiter": "\t",
                    "writeMode": "append",
                    "charsetName": "UTF-8",
                    "maxFileSize": 1073741824,
                    "tablesColumn": "{\"dept22\":[{\"key\":\"deptno\",\"type\":\"int\",\"precision\":10,\"columnDisplaySize\":11},{\"key\":\"address\",\"type\":\"string\"}]}",
                    "defaultFS": "hdfs://172.18.8.207:8020",
                    "dataSourceId": 15,
                    "partition": "pt",
                    "partitionType": "USERDEFINED",
                    "partitionValue": "2022041000"
                }
            }
        }
    ]
}

}

Anything else

No response

Version

1.12_release

Are you willing to submit PR?

Code of Conduct

Paddy0523 commented 2 years ago

Checkpoint resume (断点续传) means recovering a failed job from a specified checkpoint. For the relevant background, see the docs: https://dtstack.github.io/chunjun/documents/f29c0d86-f41a-5de1-a705-6dc2b6df91fb

From your description I can't tell whether you want checkpoint resume or incremental sync. Incremental sync docs: https://dtstack.github.io/chunjun/documents/d1b20bf7-fab2-5a56-8f4d-6fb13ce9fec0
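For reference, in the 1.12 release checkpoint resume is configured through a restore block under the job's setting section, with Flink checkpointing enabled on the cluster side. A minimal sketch, assuming the setting.restore keys (isRestore, restoreColumnName, restoreColumnIndex) described in the checkpoint-resume document linked above; verify the exact key names against the docs for your version:

    "setting": {
        "restore": {
            "isRestore": true,
            "restoreColumnName": "id",
            "restoreColumnIndex": 0
        }
    }

Here isRestore switches the feature on, and restoreColumnName / restoreColumnIndex identify the column whose last written value is kept in checkpoint state, so a restarted job can continue from that position.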

biandou1313 commented 2 years ago

My mistake, I meant incremental sync.


Paddy0523 commented 2 years ago

For the specifics of incremental sync, see the documentation. If you are not relying on taier, you need to fill in startLocation manually.
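A minimal sketch of what that looks like for the reader in this issue: keep increColumn and add startLocation holding the last synced value (the 1000 below is a hypothetical placeholder):

    "reader": {
        "name": "mysqlreader",
        "parameter": {
            "increColumn": "id",
            "startLocation": "1000",
            "connection": [
                {
                    "table": ["mysqlreader2"],
                    "jdbcUrl": ["jdbc:mysql://172.18.8.77:3306/zk_test"]
                }
            ]
        }
    }

Each run then reads only rows whose id is beyond startLocation; whatever schedules the job has to carry the end position of one run into the startLocation of the next (taier automates this bookkeeping).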

biandou1313 commented 2 years ago

One more question: how can I improve the performance of MySQL-to-Hive sync? With the same server and database, a table of 90M+ rows takes Sqoop only 10 minutes, but flinkx 1.12 takes 3.5 hours (both MySQL to Hive).


Paddy0523 commented 2 years ago

I'd suggest running it on the latest version; the newer versions include a round of performance optimization.
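Beyond upgrading, one thing worth checking in the job above: speed.channel is 1, so the MySQL read runs single-threaded, while Sqoop typically fans out across many mappers. A hedged sketch of raising read parallelism, assuming mysqlreader supports the splitPk parameter for range-splitting the source table across channels (check the reader docs for your version):

    "setting": {
        "speed": {
            "channel": 8,
            "bytes": 0
        }
    }

and, inside the reader's parameter block:

    "splitPk": "id"

Without a split key the extra channels have no way to divide the table among themselves, so both settings are needed; id is a reasonable choice because it is numeric and presumably unique and evenly distributed.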

biandou1313 commented 2 years ago

The 1.15-beta version?


Paddy0523 commented 2 years ago

master is fine.