apache / shardingsphere

Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
Apache License 2.0
19.89k stars 6.73k forks source link

[New Feature] Implement MySQL/PostgreSQL/openGauss system database query #24662

Closed FlyingZC closed 1 year ago

FlyingZC commented 1 year ago

Background

Hi community. ShardingSphere is committed to providing a multi-source heterogeneous, enhanced database platform and further building an ecosystem around the upper layer of the platform. So ShardingSphere needs to be compatible with the inquiries of the system database of various types of databases, such as MySQL, PostgreSQL, and OpenGAUSS.

We are ready to be divided into two stages to implement this feature:

Phase 1: Mock all system database tables and view query, return to the empty result;

Phase 2: Adapt to the query of each table and view in turn, and return the correct result.

At this stage we only need to complete Phase one, the tasks are split as follows.

Tasks

mysql

The system schemas included in mysql are: information_schema, mysql, performance_schema, sys. information_schema schema has 61 tables; mysql schema has 31 tables; performance_schema has 87 tables; sys has 1 table and 100 views.

They are as follows:

information_schema

mysql

performance_schema

sys

tables

views

PostgreSQL

PostgreSQL contains two system schemas, information_schema and pg_catalog. information_schema has 4 tables and 65 views. pg_catalog has 62 tables and 72 views. They are as follows:

information_schema

tables

views

pg_catalog

tables

views

openGauss

openGauss has two system schemas, information_schema and pg_cataglog. information_schema has 7 tables, 59 views. pg_cataglog has 115 tables, 125 views. They are as follows:

information_schema

Tables

Views

pg_cataglog

Tables

Views

strongduanmu commented 1 year ago

That's great, let's make system database queries even better.

FlyingZC commented 1 year ago

The mock system table metadata example is as follows: https://github.com/apache/shardingsphere/pull/22057 https://github.com/apache/shardingsphere/pull/21512

dongzl commented 1 year ago

Hi @FlyingZC I want to do the mysql schema for MySQL database.

strongduanmu commented 1 year ago

@dongzl Welcome.

photon8231 commented 1 year ago

try to develop the mysql.plugin , thanks

sudip-unb commented 1 year ago

Hi @FlyingZC I am interested to work on PostgreSQL schema.

FlyingZC commented 1 year ago

@photon8231 @sudip-unb Welcome!

QiangYuan0828 commented 1 year ago

Hi, This is Qiang Yuan, a candidate for Gsoc2023. After learning from PR above, I would like to work on performance_schema of MySQL. But I want to make sure I understand the target correctly:

  1. create a .yaml file for each table under infra/common/src/main/resources/schema/mysql/performance_schema/.
  2. update enmu type in infra/common/src/main/java/org/apache/shardingsphere/infra/metadata/database/schema/builder/SystemSchemaBuilderRule.java.
  3. add test cases under infra/common/src/test/java/org/apache/shardingsphere/infra/metadata/database/schema/builder/.
  4. may need to write *.xml files under /test, but I am a little confused about where the location should be and how they will work for testing.
FlyingZC commented 1 year ago

@QiangYuan0828 Welcome!

You can refer to this pr https://github.com/apache/shardingsphere/pull/24860.

You can add the assertion file of the query result in this file test/e2e/sql/src/test/resources/cases/dql/dql-integration-select-system-schema.xml.

image

And the assertion file under test/resources/cases/dql/dataset/db/ like this:

image

You can use this naming convention to name assertion files select_${databaseType}_${schemaName}_${tableName}, like select_mysql_information_schema_innodb_sys_fields.xml.

QiangYuan0828 commented 1 year ago

@FlyingZC Got it, thank you so much!

wang-qingyang commented 1 year ago

Hi, This is wangqingyang, I would like to work on pg_catalog of PostgreSQL. After learning from PR above, I didn't see any related issues about Implement PostgreSQL system database query, so I am not sure if the work should be done like this:

  1. create a .yaml file for each table under infra/common/src/main/resources/schema/postgresql/pg_catalog/. like pg_aggregate.yaml

  2. update enmu type in infra/common/src/main/java/org/apache/shardingsphere/infra/metadata/database/schema/builder/SystemSchemaBuilderRule.java

    POSTGRESQL_PG_CATALOG("PostgreSQL", "pg_catalog", new HashSet<>(Arrays.asList("pg_class", "pg_database", "pg_inherits", "pg_tablespace", "pg_trigger", "pg_namespace"))),
  3. add test cases under infra/common/src/test/java/org/apache/shardingsphere/infra/metadata/database/schema/builder/SystemSchemaBuilderTest.java

       @Test
       void assertBuildForPostgreSQL() {
           Map<String, ShardingSphereSchema> actual = SystemSchemaBuilder.build("sharding_db", new PostgreSQLDatabaseType());
           assertThat(actual.size(), is(3));
           assertTrue(actual.containsKey("information_schema"));
           assertTrue(actual.containsKey("pg_catalog"));
           assertTrue(actual.containsKey("shardingsphere"));
           assertThat(actual.get("information_schema").getTables().size(), is(3));
           assertThat(actual.get("pg_catalog").getTables().size(), is(6));
           assertThat(actual.get("shardingsphere").getTables().size(), is(2));
       }

    infra/common/src/test/java/org/apache/shardingsphere/infra/metadata/database/schema/builder/SystemSchemaBuilderRuleTest.java

    @Test
    void assertValueOfSchemaPathSuccess() {
       SystemSchemaBuilderRule actualInformationSchema = SystemSchemaBuilderRule.valueOf(new MySQLDatabaseType().getType(), "information_schema");
       assertThat(actualInformationSchema, is(SystemSchemaBuilderRule.MYSQL_INFORMATION_SCHEMA));
       assertThat(actualInformationSchema.getTables().size(), is(61));
       SystemSchemaBuilderRule actualMySQLSchema = SystemSchemaBuilderRule.valueOf(new MySQLDatabaseType().getType(), "mysql");
       assertThat(actualMySQLSchema, is(SystemSchemaBuilderRule.MYSQL_MYSQL));
       assertThat(actualMySQLSchema.getTables().size(), is(31));
       SystemSchemaBuilderRule actualPerformanceSchema = SystemSchemaBuilderRule.valueOf(new MySQLDatabaseType().getType(), "performance_schema");
       assertThat(actualPerformanceSchema, is(SystemSchemaBuilderRule.MYSQL_PERFORMANCE_SCHEMA));
       assertThat(actualPerformanceSchema.getTables().size(), is(87));
    }
    1. add the assertion file of the query result in test/e2e/sql/src/test/resources/cases/dql/dataset/db/dql-integration-select-system-schema.xml
<test-case sql="SELECT * FROM pg_catalog.pg_aggregate" db-types="PostgreSQL" scenario-types="db">
    <assertion expected-data-file="select_postgresql_pg_catalog_pg_aggregate.xml" />
</test-case>

add assertion file under test/resources/cases/dql/dataset/db/ like this: select_postgresql_pg_catalog_pg_aggregate.xml

FlyingZC commented 1 year ago

@Wang-qing-yang Welcome! You are right to write code like this.

wang-qingyang commented 1 year ago

@FlyingZC hi, I have already finished the code for _pgaggregate in _pgcatalog for PostgreSQL two days ago , I wrote the code like what I mentioned before, but I couldn't pass maven check locally : mvn clean install -B -T2C -DskipTests -Dmaven.javadoc.skip=true -e. and the erro is : [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.2.1:check (validate) on project shardingsphere-infra-common: Failed during checkstyle execution: There is 1 error reported by Checkstyle 9.3 with D:/git/shardingsphere/src/resources/checkstyle.xml ruleset. -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.2.1:check (validate) on project shardingsphere-infra-common: Failed during checkstyle execution at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:347) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:330) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175) at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76) at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163) at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:193) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:180) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1136) at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:635) at java.lang.Thread.run (Thread.java:833) Caused by: org.apache.maven.plugin.MojoExecutionException: Failed during checkstyle execution at org.apache.maven.plugins.checkstyle.CheckstyleViolationCheckMojo.execute (CheckstyleViolationCheckMojo.java:554) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:126) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:342) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:330) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175) at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76) at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163) at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:193) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:180) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1136) at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:635) at java.lang.Thread.run (Thread.java:833) Caused by: org.apache.maven.plugins.checkstyle.exec.CheckstyleExecutorException: There is 1 error reported by Checkstyle 9.3 with D:/git/shardingsphere/src/resources/checkstyle.xml ruleset. at org.apache.maven.plugins.checkstyle.exec.DefaultCheckstyleExecutor.executeCheckstyle (DefaultCheckstyleExecutor.java:243) at org.apache.maven.plugins.checkstyle.CheckstyleViolationCheckMojo.execute (CheckstyleViolationCheckMojo.java:545) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:126) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute2 (MojoExecutor.java:342) at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:330) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175) at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76) at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163) at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39) at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:193) at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:180) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539) at java.util.concurrent.FutureTask.run (FutureTask.java:264) at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1136) at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:635) at java.lang.Thread.run (Thread.java:833) and I tried to find why , but they all didn't work, I also cloned the original file, changed nothing but also could passed mvn clean install -B -T2C -DskipTests -Dmaven.javadoc.skip=true -e, and had the same error, that really confused me :(

and in files like _pgclass.yaml , in pg_class the oid column type should be oid from the document PostgreSQL 15 , and in this file : name: pg_class columns: oid: caseSensitive: true dataType: -5 generated: false name: oid primaryKey: false visible: true the dataType is -5 , but form the pg_type table in PostgreSQL 15, I think the dataType of oid should be 26: image

I would like to know if it is right to follow the pg_type table to confirm the dataType in .yaml files for PostgreSQL?

FlyingZC commented 1 year ago

@Wang-qing-yang Hi,you can refer to the logic oforg.apache.shardingsphere.infra.metadata.database.schema.loader.datatype.DataTypeLoader#load and org.apache.shardingsphere.infra.metadata.database.schema.loader.metadata.dialect.PostgreSQLSchemaMetaDataLoader#loadColumnMetaDataMap

QiangYuan0828 commented 1 year ago

Hi, I would like to try thesys schema in Mysql, but not sure if I should treat views as well as tabels in implementation. If there are lots of differents, can I get some references?

FlyingZC commented 1 year ago

Hi, I would like to try thesys schema in Mysql, but not sure if I should treat views as well as tabels in implementation. If there are lots of differents, can I get some references?

@QiangYuan0828 Welcome! Views are the same as tables.

FlyingZC commented 1 year ago

Closed for completion.