apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[cdc] Fix performance issue in CanalRecordParser #3572

Closed MOBIN-F closed 1 week ago

MOBIN-F commented 1 week ago

Purpose

Linked issue: close #3571

This PR optimizes the following:

  1. toPaimonFieldTypes and the iteration over recordMap.entrySet() can share a single for loop, which reduces the number of traversals.
  2. getShortType, getPrecision, and getScale parse the same string repeatedly in toDataType(String mysqlFullType, TypeMapping typeMapping) and transformValue(@Nullable String oldValue, String mySqlType); getShortType in particular is called many times.
  3. Refactored the parsing logic for the shortType, length, and scale of mysqlFullType.
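
To illustrate points 2 and 3, here is a minimal sketch of parsing a MySQL full type string in a single pass and reusing the result. It is not the code in this PR: the class `MySqlTypeSketch`, the holder `ParsedType`, the `-1` sentinel for an absent length or scale, and the choice to keep a trailing modifier such as `UNSIGNED` in the short type are all assumptions made for illustration.

```java
import java.util.Objects;

/** Hypothetical sketch: parse a MySQL full type string once and reuse the parts. */
public class MySqlTypeSketch {

    /** Result of one parsing pass; length and scale are -1 when absent (assumed convention). */
    public static final class ParsedType {
        public final String shortType;
        public final int length;
        public final int scale;

        ParsedType(String shortType, int length, int scale) {
            this.shortType = shortType;
            this.length = length;
            this.scale = scale;
        }
    }

    /** Splits e.g. "DECIMAL(10,2) UNSIGNED" into short type, length, and scale in one pass. */
    public static ParsedType parse(String mysqlFullType) {
        Objects.requireNonNull(mysqlFullType, "mysqlFullType");
        String full = mysqlFullType.trim();
        int open = full.indexOf('(');
        int close = full.indexOf(')', open + 1);
        if (open < 0 || close < 0) {
            // No parenthesized arguments, e.g. "INT" or "DATETIME".
            return new ParsedType(full, -1, -1);
        }
        String prefix = full.substring(0, open).trim();
        String suffix = full.substring(close + 1).trim();
        // Assumption: modifiers after ")" stay part of the short type.
        String shortType = suffix.isEmpty() ? prefix : prefix + " " + suffix;

        // limit -1 keeps empty tokens, so args always has at least one element.
        String[] args = full.substring(open + 1, close).split(",", -1);
        int length = parseIntOrAbsent(args[0]);
        int scale = args.length > 1 ? parseIntOrAbsent(args[1]) : -1;
        return new ParsedType(shortType, length, scale);
    }

    /** Returns -1 for non-numeric arguments such as the values of an ENUM. */
    private static int parseIntOrAbsent(String s) {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        ParsedType t = parse("DECIMAL(10,2) UNSIGNED");
        System.out.println(t.shortType + " length=" + t.length + " scale=" + t.scale);
    }
}
```

With a holder like this, the type mapping and the value transformation could both take the same `ParsedType` instead of each calling the string helpers again on every record.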

Tests

API and Format

Documentation

MOBIN-F commented 1 week ago

Can you help review this PR? Thanks @JingsongLi @yuzelin

yuzelin commented 1 week ago

+1