apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0
7.59k stars 1.68k forks

[Feature][Transform] Add split multiple source table in transform #6834

Open liunaijie opened 1 month ago

liunaijie commented 1 month ago

Description

SeaTunnel now supports reading multiple tables in a source, but result_table_name can only name a single table, so a transform or sink cannot select just the one table it needs. We need to implement a split function that separates the source's multiple tables.

One use case: read multiple tables in a source, apply a different transform to each table, and write the results to one or more sinks.

An example config looks like this:


source {
  read_tables = [tableA, tableB, tableC]
}

transform {
  Sql {
    source_table_name = "tableA"
    result_table_name = "tableA_sql"
    query = "select xxxx from tableA"
  }

  Sql {
    source_table_name = "tableB"
    result_table_name = "tableB_sql"
    query = "select xxxx from tableB"
  }

  Sql {
    source_table_name = "tableC"
    result_table_name = "tableC_sql"
    query = "select xxxx from tableC"
  }
}

sink {
  LocalFile {
    source_table_name = "tableA_sql"
  }

  MySQL {
    source_table_name = "tableB_sql"
  }

  Hive {
    source_table_name = ["tableA_sql", "tableB_sql", "tableC_sql"]
  }
}
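
To make the intended data flow of the config above concrete, here is a minimal Python sketch of the split-and-route idea. It is not SeaTunnel code: `split_by_table`, `run_pipeline`, the `(table_id, record)` row shape, and the `fn` callables standing in for the Sql queries are all hypothetical names chosen for illustration. It only assumes that each row coming out of a multi-table source carries its table identifier.

```python
from collections import defaultdict


def split_by_table(rows):
    """Group (table_id, record) pairs from a multi-table source by table id."""
    tables = defaultdict(list)
    for table_id, record in rows:
        tables[table_id].append(record)
    return dict(tables)


def run_pipeline(rows, transforms, sinks):
    """Split the source, apply per-table transforms, and route results to sinks."""
    # Step 1: split the single multi-table stream into named datasets.
    datasets = split_by_table(rows)

    # Step 2: each transform reads one named table and registers its
    # output under result_table_name (hypothetical stand-in for Sql).
    for t in transforms:
        source = datasets.get(t["source_table_name"], [])
        datasets[t["result_table_name"]] = [t["fn"](r) for r in source]

    # Step 3: a sink may name one table (string) or several (list).
    delivered = {}
    for s in sinks:
        names = s["source_table_name"]
        if isinstance(names, str):
            names = [names]
        delivered[s["name"]] = [r for n in names for r in datasets.get(n, [])]
    return delivered


# Illustrative data mirroring the example config (values are made up).
rows = [
    ("tableA", {"id": 1}),
    ("tableB", {"id": 2}),
    ("tableC", {"id": 3}),
    ("tableA", {"id": 4}),
]
transforms = [
    {"source_table_name": "tableA", "result_table_name": "tableA_sql",
     "fn": lambda r: {**r, "src": "A"}},
    {"source_table_name": "tableB", "result_table_name": "tableB_sql",
     "fn": lambda r: {**r, "src": "B"}},
    {"source_table_name": "tableC", "result_table_name": "tableC_sql",
     "fn": lambda r: {**r, "src": "C"}},
]
sinks = [
    {"name": "LocalFile", "source_table_name": "tableA_sql"},
    {"name": "MySQL", "source_table_name": "tableB_sql"},
    {"name": "Hive",
     "source_table_name": ["tableA_sql", "tableB_sql", "tableC_sql"]},
]

result = run_pipeline(rows, transforms, sinks)
```

In this sketch the LocalFile sink receives only the transformed tableA rows, MySQL only tableB, and Hive the union of all three, which is the behavior the config asks for. How the real engine would tag rows with their table and expose the split is an open design question for the feature.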

Usage Scenario

No response

Related issues

No response

EricJoy2048 commented 1 month ago

There is a PR for this feature: https://github.com/apache/seatunnel/pull/5646. However, the contributor is busy with other work and cannot continue it.