apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] Infinite loop after using online schema change to add a new column and update values #3450

Open chh3-chan opened 4 months ago

chh3-chan commented 4 months ago

Paimon version

0.7.0

Compute Engine

Flink 1.17.2

Minimal reproduce step

I did all of this in a local environment. Please let me know if I missed any key steps. Many thanks!

  1. Start Percona Server for MySQL 8.0 in Docker:
       docker run -d --name ps -e MYSQL_ROOT_USER=root -e MYSQL_ROOT_PASSWORD=123456 -p 3306:3306 percona/percona-server:8.0
  2. Add the sample database inventory and the table customer_percona with some rows.
  3. Install percona-toolkit 3.0.0.
  4. Set up and run the Flink job:
       bin/flink run lib/paimon-flink-action-0.7-SNAPSHOT.jar mysql_sync_database \
         --warehouse hdfs://localhost:8020/paimon/warehouse \
         --database inventory \
         --mysql_conf hostname=127.0.0.1 \
         --mysql_conf port=3306 \
         --mysql_conf username=root \
         --mysql_conf password=123456 \
         --mysql_conf database-name=inventory \
         --mysql_conf server-time-zone=UTC \
         --catalog_conf metastore=filesystem \
         --table_conf bucket=4 \
         --table_conf changelog-producer=input \
         --table_conf sink.parallelism=4 \
         --type_mapping to-nullable
  5. Run an online schema change to add a new column:
       pt-online-schema-change --alter "add column bag_fee double" h=127.0.0.1,p=123456,u=root,P=3306,D=inventory,t=customer_percona --execute
  6. Update a value in the newly created column:
       UPDATE `inventory`.`customer_percona` SET `bag_fee` = '0' WHERE (`id` = '1');

What doesn't meet your expectations?

The writer operator stays at Busy (max) 100%, and flink-root-taskexecutor-xxxx.log keeps reporting that it is waiting for a schema update:

  2024-05-31 12:25:45,596 INFO org.apache.paimon.flink.sink.cdc.CdcRecordUtils [] - Field bag_fee not found. Waiting for schema update.
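
To make the symptom concrete, here is a toy model of a "wait for schema update" retry loop. It is not Paimon's code: `convert_record`, `write_with_schema_wait`, and the `max_retries` cap are hypothetical names introduced for illustration. The sketch assumes the writer re-reads the table schema and retries the same record until every field is known. Under that assumption, a column that never reaches the schema keeps the writer busy indefinitely unless something bounds the loop.

```python
from typing import Callable, Dict, Optional, Set


def convert_record(record: Dict[str, object], schema_fields: Set[str]) -> Optional[Dict[str, object]]:
    """Return a copy of the record if every field is in the schema, else None."""
    missing = [name for name in record if name not in schema_fields]
    if missing:
        return None
    return dict(record)


def write_with_schema_wait(
    record: Dict[str, object],
    load_schema: Callable[[], Set[str]],
    max_retries: int,
) -> int:
    """Retry conversion against a freshly loaded schema.

    Returns the number of retries that were needed. Raises TimeoutError when
    the schema still lacks a field after max_retries retries. The cap is an
    assumption of this sketch; without it the loop would never terminate.
    """
    for attempt in range(max_retries + 1):
        if convert_record(record, load_schema()) is not None:
            return attempt
    raise TimeoutError(f"schema still lacks fields after {max_retries} retries")


# A schema that never learns about bag_fee, as in the scenario reported here.
stale_schema = lambda: {"id", "name"}
record = {"id": 1, "name": "a", "bag_fee": 0.0}

try:
    write_with_schema_wait(record, stale_schema, max_retries=3)
except TimeoutError as exc:
    print("gave up:", exc)
```

With an unbounded loop in place of `max_retries`, the same stale schema would produce exactly the kind of repeated "waiting" log line and permanently busy writer described above.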


Before: [screenshot]


During the online schema change: [screenshot]


After updating the value in the newly created column: [screenshots]

Anything else?

No response

MOBIN-F commented 4 months ago

When pt-online-schema-change performs an ALTER operation, it creates temporary tables with the suffixes _new and _old. Paimon cannot handle this situation.
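
The following sketch illustrates the point about temporary tables. It outlines, in simplified form, the DDL that an online schema change of this style issues: the ALTER is applied to a shadow copy, and the copy is then swapped in by a rename. The helper names (`shadow_table_names`, `outline_online_alter`, `alters_targeting`) are hypothetical, the trigger setup and chunked row copy are omitted, and `CREATE TABLE ... LIKE` stands in for however the tool actually creates the copy. Whether this is the exact reason the new column never reaches the Paimon schema is an assumption, but it shows that no ALTER statement ever names the synced table directly.

```python
from typing import List, Tuple


def shadow_table_names(table: str) -> Tuple[str, str]:
    """Shadow-table names: the altered working copy and the retired original."""
    return f"_{table}_new", f"_{table}_old"


def outline_online_alter(table: str, alter: str) -> List[str]:
    """Simplified DDL outline (triggers and the chunked row copy are omitted)."""
    new, old = shadow_table_names(table)
    return [
        f"CREATE TABLE `{new}` LIKE `{table}`",  # stand-in for creating the copy
        f"ALTER TABLE `{new}` {alter}",          # the ALTER hits the copy only
        f"RENAME TABLE `{table}` TO `{old}`, `{new}` TO `{table}`",
        f"DROP TABLE `{old}`",
    ]


def alters_targeting(table: str, statements: List[str]) -> List[str]:
    """Return the ALTER statements that name `table` itself."""
    prefix = f"ALTER TABLE `{table}` "
    return [s for s in statements if s.startswith(prefix)]


statements = outline_online_alter("customer_percona", "ADD COLUMN bag_fee DOUBLE")
for statement in statements:
    print(statement)

# No ALTER in the outline targets customer_percona by name.
print(alters_targeting("customer_percona", statements))  # prints []
```

A consumer that learns about schema changes only from ALTER statements on the table it tracks would see none here, while later row changes on customer_percona already carry the bag_fee column.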