apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Improve][Connector-v2] Optimize how databases and tables are checked for existence #7261

Closed dailai closed 1 month ago

dailai commented 1 month ago

Purpose of this pull request

  1. Changed how all JDBC catalogs check whether a database or table exists. The original method listed every table and then checked whether the resulting list of table names contained the given name, which can be a performance problem.

  2. In addition to the above issue, MySQL by default does not allow creating tables whose names differ only in case, which can lead to errors like the one in the following image.
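
To make the two points above concrete, here is a minimal sketch contrasting the list-then-contains check with a single targeted lookup. It is an illustration only, not the code from this pull request: the class name `TableExistenceSketch`, the method names, and the MySQL `information_schema` query are all assumptions chosen for the example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the code from this pull request.
class TableExistenceSketch {

    // Assumed MySQL query: asks the server about one table instead of listing all of them.
    static final String EXISTS_SQL =
            "SELECT 1 FROM information_schema.tables "
                    + "WHERE table_schema = ? AND table_name = ? LIMIT 1";

    // Old style: the caller has already fetched every table name.
    // List.contains relies on String.equals, so the match is case-sensitive.
    static boolean existsByListing(List<String> allTables, String table) {
        return allTables.contains(table);
    }

    // New style: one targeted lookup; the database decides how names compare.
    static boolean existsByQuery(Connection conn, String database, String table)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(EXISTS_SQL)) {
            ps.setString(1, database);
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next(); // at least one row means the table exists
            }
        }
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("orders", "users");
        System.out.println(existsByListing(tables, "users"));
        // Same name in a different case: the Java-side check reports "not found",
        // even where the database would treat the two names as the same table.
        System.out.println(existsByListing(tables, "USERS"));
    }
}
```

The listing approach transfers every table name to the client and compares them in Java, so its cost grows with the number of tables and its case handling can disagree with the database. The query approach returns at most one row and leaves name comparison to the server.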

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

dailai commented 1 month ago
  1. MySQL mysql_catalog

  2. Oracle oracle_catalog

  3. PostgreSQL postgre_catalog

  4. SQL Server sqlserver_catalog

In addition, org.apache.seatunnel.connectors.seatunnel.jdbc.AbstractJdbcIT#testCatalog verifies the related e2e cases.

hailin0 commented 1 month ago

Thanks, can you change all catalogs?

https://github.com/apache/seatunnel/tree/dev/seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/catalog

dailai commented 1 month ago

done