GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

[Bug]: Snapshot stage is interrupted when using the cdc-embedded-connector module to transfer a large MySQL table (10M rows) #519

Closed blackmidnight closed 3 months ago

blackmidnight commented 1 year ago

Related Template(s)

cdc-embedded-connector

What happened?

We use cdc-embedded-connector to transfer a MySQL table to GCP Pub/Sub. At first the table had only a few hundred thousand rows, and the program ran the complete process normally, from snapshot through to binlog reading. After the table grew to 10 million rows, the snapshot stage was interrupted after scanning about 4.85 million rows.

I retried many times. The number of rows scanned before the interruption differed on each run (720,000, 1.76 million, and so on), and so did the elapsed time before it stopped (15 minutes, 30 minutes, and so on). We checked the logs and found no task exception information. We also read and debugged the source code but could not find the cause. Whether the cause is a timeout on the MySQL server or an OOM on the client, there should be a corresponding error log, so the absence of one puzzles us.

Our environment:

mysql: Azure MySQL 5.7
k8s: 1.21.14-gke.3000
offset storage: file

Screenshots of the relevant logs are attached.

Beam Version

Newer than 2.35.0

Relevant log output

2022-11-24 10:25:11 [debezium-mysqlconnector-appwheel-snapshot] INFO io.debezium.connector.mysql.SnapshotReader - Step 8: - 1750000 of 9678166 rows scanned from table 'appwheel.aw_order_log' after 00:15:51.159
2022-11-24 10:25:13 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:13 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:13 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:14 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:14 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:14 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:16 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:16 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:16 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:17 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:17 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:17 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:19 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:19 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:19 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:19 [debezium-mysqlconnector-appwheel-snapshot] INFO io.debezium.connector.mysql.SnapshotReader - Step 8: - 1760000 of 9678166 rows scanned from table 'appwheel.aw_order_log' after 00:15:58.549
2022-11-24 10:25:20 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine is polling task for records on thread Thread[pool-3-thread-1,5,main]
2022-11-24 10:25:20 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Embedded engine returned from polling task for records
2022-11-24 10:25:20 [pool-3-thread-1] DEBUG io.debezium.embedded.EmbeddedEngine - Received 2048 records from the task
2022-11-24 10:25:34 [main] INFO com.google.cloud.dataflow.cdc.connector.DebeziumToPubSubDataSender - Waiting another 30 seconds for the embedded engine to shut down
2022-11-24 10:26:04 [main] INFO com.google.cloud.dataflow.cdc.connector.DebeziumToPubSubDataSender - Waiting another 30 seconds for the embedded engine to shut down
2022-11-24 10:26:34 [main] INFO com.google.cloud.dataflow.cdc.connector.DebeziumToPubSubDataSender - Waiting another 30 seconds for the embedded engine to shut down
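
To put numbers on how far the snapshot got before the shutdown messages appear, the two "rows scanned" progress lines in the log above can be parsed and compared. The following is only a minimal sketch, assuming the progress line format stays exactly as shown in this excerpt; `SnapshotProgress`, `parse`, and `rowsPerSecond` are hypothetical helper names for illustration and are not part of Debezium or DataflowTemplates.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SnapshotProgress {
    // Matches the SnapshotReader progress lines from the log excerpt, e.g.
    // "1750000 of 9678166 rows scanned from table 'appwheel.aw_order_log' after 00:15:51.159"
    private static final Pattern PROGRESS = Pattern.compile(
        "(\\d+) of (\\d+) rows scanned from table '([^']+)' after (\\d+):(\\d{2}):(\\d{2})\\.(\\d{3})");

    /** Returns {rowsScanned, totalRows, elapsedMillis}, or null if the line is not a progress line. */
    static long[] parse(String line) {
        Matcher m = PROGRESS.matcher(line);
        if (!m.find()) {
            return null;
        }
        long hours = Long.parseLong(m.group(4));
        long minutes = Long.parseLong(m.group(5));
        long seconds = Long.parseLong(m.group(6));
        long millis = Long.parseLong(m.group(7));
        long elapsedMillis = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2)), elapsedMillis};
    }

    /** Average rows per second between two progress samples. */
    static double rowsPerSecond(long[] earlier, long[] later) {
        return (later[0] - earlier[0]) * 1000.0 / (later[2] - earlier[2]);
    }

    public static void main(String[] args) {
        // The two progress lines copied from the log output in this issue.
        long[] first = parse("Step 8: - 1750000 of 9678166 rows scanned from table "
            + "'appwheel.aw_order_log' after 00:15:51.159");
        long[] last = parse("Step 8: - 1760000 of 9678166 rows scanned from table "
            + "'appwheel.aw_order_log' after 00:15:58.549");

        double rate = rowsPerSecond(first, last);
        long remainingRows = last[1] - last[0];
        System.out.printf("rows/s between samples: %.1f%n", rate);
        System.out.printf("rows remaining at last sample: %d%n", remainingRows);
        System.out.printf("estimated minutes to finish at that rate: %.1f%n",
            remainingRows / rate / 60.0);
    }
}
```

Running this over a full log rather than two lines would show whether the scan rate drops before each interruption or whether the snapshot stops while still progressing at a steady rate, as it appears to in this excerpt.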
github-actions[bot] commented 4 months ago

This issue has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the issue at any time. Thank you for your contributions.

github-actions[bot] commented 3 months ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.