vesoft-inc / nebula-flink-connector

Flink Connector for Nebula Graph
48 stars 30 forks source link

table name design for Flink SQL Connector #70

Closed liuxiaocs7 closed 2 years ago

liuxiaocs7 commented 2 years ago

For the current implementation of flink sql nebula as sink(#57), I think the table created by flink sql is a temporary table now, the table name is default_catalog.default_database.table_name, and the table name in flink sql create statement is the same as the vertex/edge name in nebula? If that's true, how to create two tables in flink sql from different nebula graph spaces with same name?

For example, in the code bellow we can see, now the table name must equals to vertex name in nebula.

Shoule we add a parameter in with clause to increase flexibility?

At the same time, I noticed the listTables function in NebulaCatalog.java, comments are as follows: Tag and Edge, tag starts with VERTEX. and edge starts with EDGE., should we be compatible with this table name design if we want to use our own catalog?

liuxiaocs7 commented 2 years ago

Split discussion from #58.

spike-liu commented 2 years ago

+1

@Nicole00 what is your idea?

Nicole00 commented 2 years ago

Sorry for late reply.

  1. That's a great catch, and I agree with you @liuxiaocs7 .can we maintain the temporary table name and the real table name? (real table name has the space name as prefix)
  2. Maybe you can redefine the catalog to make them be compatible with the name in create statement.

SO GLAD to DISCUSS these details with you both. @spike-liu @liuxiaocs7

Nicole00 commented 2 years ago

Learn from hbase connector, we should add the nebula real table name param in create table statement.

     CREATE TABLE fTable(
          id STRING,
          prop1 STRING,
          prop2 STRING
          ) WITH (
         'connector' = ' nebula ',
         'table-name' = 'person',
         'graph-space' = 'flinkSink',
         'data-type' = 'vertex',
         xxx
          )

ftable is the table name used in flinksql, and person is the tag name in nebula.

And NebulaGraph has no default space, so graph-space should be reserved. What do you thank? @liuxiaocs7

liuxiaocs7 commented 2 years ago

Learn from hbase connector, we should add the nebula real table name param in create table statement.

  CREATE TABLE fTable(
       id STRING,
       prop1 STRING,
       prop2 STRING
       ) WITH (
      'connector' = ' nebula ',
      'table-name' = 'person',
      'graph-space' = 'flinkSink',
      'data-type' = 'vertex',
      xxx
       )

ftable is the table name used in flinksql, and person is the tag name in nebula.

And NebulaGraph has no default space, so graph-space should be reserved. What do you thank? @liuxiaocs7

I totally agree with you, add another param in with clause like 'table-name' = 'person' is a more common usage, then the table name in Flink SQL is separated from the tag/edge name in nebula.

graph-space should be reserved, i think so.

Thanks for your suggestion, @Nicole00, I will then refactor this logic~

liuxiaocs7 commented 2 years ago

Sorry for late reply.

  1. That's a great catch, and I agree with you @liuxiaocs7 .can we maintain the temporary table name and the real table name? (real table name has the space name as prefix)
  2. Maybe you can redefine the catalog to make them be compatible with the name in create statement.

SO GLAD to DISCUSS these details with you both. @spike-liu @liuxiaocs7

  1. In my opinion, real table name has the space name as prefix seems not a good idea, because ① that's means we have to create a tag/edge with space name, it is redundant. just like

    USE flinkSink
    CREATE TAG IF NOT EXISTS `flinkSink.person`

    ② when we to use to table, in fact there is a graph space name in with clause, but we have to declare one more time in table name. add table name seems more ok.

    1. After we have determined, we can modify the previous NebulaCatalog to maintain consistency.
Nicole00 commented 2 years ago

Sorry for late reply.

  1. That's a great catch, and I agree with you @liuxiaocs7 .can we maintain the temporary table name and the real table name? (real table name has the space name as prefix)
  2. Maybe you can redefine the catalog to make them be compatible with the name in create statement.

SO GLAD to DISCUSS these details with you both. @spike-liu @liuxiaocs7

  1. In my opinion, real table name has the space name as prefix seems not a good idea, because ① that's means we have to create a tag/edge with space name, it is redundant. just like
USE flinkSink
CREATE TAG IF NOT EXISTS `flinkSink.person`

② when we to use to table, in fact there is a graph space name in with clause, but we have to declare one more time in table name. add table name seems more ok.

  1. After we have determined, we can modify the previous NebulaCatalog to maintain consistency.

Sorry, my description is not clear. real table name has the space name as prefix means when we maintain the mapping of flink sql table and nebula real table, the real table should not just use tag/edge name, but with the space as prefix. eg:

create table ftable() with (table-name=person, space-name=test)

the mapping can be <ftable, test.person>