apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0
921 stars 297 forks

[Bug report] Catalog backend name handle the renaming of catalog #4718

Open jerqi opened 2 weeks ago

jerqi commented 2 weeks ago

Version

main branch

Describe what's wrong

Currently, the Iceberg JDBC backend uses the catalog name as the default backend name. In my understanding, this will trigger a bug if I rename the catalog, because the default backend name changes along with it.

Error message and/or stacktrace

From the code.

catalogname
    String catalogName = properties.get("catalog-name");
    jdbcProperties.put(
        "iceberg.jdbc-catalog.catalog-name",
        properties.getOrDefault(IcebergConstants.CATALOG_BACKEND_NAME, catalogName));
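To make the concern concrete, here is a minimal sketch that mirrors the `getOrDefault` fallback in the snippet above. The class name and the `"catalog-backend-name"` key string are assumptions made for illustration (the key stands in for `IcebergConstants.CATALOG_BACKEND_NAME`); this is not Gravitino's actual code path.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: reproduces the fallback logic from the snippet, nothing more.
class BackendNameOnRename {
  // Assumed key string; stands in for IcebergConstants.CATALOG_BACKEND_NAME.
  static final String CATALOG_BACKEND_NAME = "catalog-backend-name";

  // Returns the name that would be handed to the JDBC backend.
  static String backendName(Map<String, String> properties) {
    String catalogName = properties.get("catalog-name");
    return properties.getOrDefault(CATALOG_BACKEND_NAME, catalogName);
  }

  public static void main(String[] args) {
    Map<String, String> properties = new HashMap<>();
    properties.put("catalog-name", "sales");
    String before = backendName(properties);

    // Simulate a rename: only the catalog name changes.
    properties.put("catalog-name", "sales_renamed");
    String after = backendName(properties);

    // The backend name silently follows the rename.
    System.out.println(before + " -> " + after); // prints "sales -> sales_renamed"

    // With an explicit backend name, the rename no longer affects it.
    properties.put(CATALOG_BACKEND_NAME, "sales");
    System.out.println(backendName(properties)); // prints "sales"
  }
}
```

If the backend keys its stored tables by this name, tables written under `sales` would presumably no longer be found after the rename, unless the backend name was set explicitly.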

How to reproduce

I had questions while reading the code.

Additional context

No response

jerqi commented 2 weeks ago

@FANNG1 Could you take a look?

FANNG1 commented 2 weeks ago

Thanks for reporting, I'll take it.

FANNG1 commented 2 weeks ago
  1. The default value of CATALOG_BACKEND_NAME should be set by the Gravitino server; the clients (Spark, Trino, Flink) should not need to care about it.
  2. Rather than the catalog id, I'd like to use the original jdbc as the default catalog name, since the Iceberg REST server doesn't have anything like a catalog id.

@diqiu50 @jerqi @jerryshao WDYT?

diqiu50 commented 2 weeks ago

What are the problems with using jdbc as the default catalog name?

FANNG1 commented 2 weeks ago

What are the problems with using jdbc as the default catalog name?

Two catalogs with the same JDBC URI will share the same tables.
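A simplified in-memory model of that sharing, assuming the JDBC backend scopes its table records by backend catalog name within the database named in the URI. The class, the URI, and the catalog and table names are all hypothetical; this is not Iceberg's JDBC catalog implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model: one "database" per JDBC URI, table rows grouped by backend name.
class SharedBackendModel {
  private final Map<String, Map<String, List<String>>> databases = new HashMap<>();

  void createTable(String jdbcUri, String backendName, String table) {
    databases
        .computeIfAbsent(jdbcUri, u -> new HashMap<>())
        .computeIfAbsent(backendName, n -> new ArrayList<>())
        .add(table);
  }

  List<String> listTables(String jdbcUri, String backendName) {
    return databases
        .getOrDefault(jdbcUri, Collections.emptyMap())
        .getOrDefault(backendName, Collections.emptyList());
  }

  public static void main(String[] args) {
    String uri = "jdbc:mysql://example-host:3306/iceberg"; // hypothetical URI
    SharedBackendModel backend = new SharedBackendModel();

    // Catalog A and catalog B both fall back to the default backend name "jdbc".
    backend.createTable(uri, "jdbc", "db.orders");      // created through catalog A
    System.out.println(backend.listTables(uri, "jdbc")); // catalog B sees: [db.orders]

    // With distinct backend names, the two catalogs stay isolated.
    backend.createTable(uri, "catalog_a", "db.users");
    System.out.println(backend.listTables(uri, "catalog_b")); // prints []
  }
}
```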

diqiu50 commented 2 weeks ago

It feels like this is not a reasonable usage.

diqiu50 commented 2 weeks ago

If the user wants to use a single MySQL instance, the Iceberg catalogs should at least use different database names.

FANNG1 commented 2 weeks ago

The database is included in the URI for the JDBC backend.

diqiu50 commented 2 weeks ago

Why do two Iceberg catalogs need to use the same database? Do users expect these two catalogs to be able to see each other's tables?

FANNG1 commented 2 weeks ago

Why do two Iceberg catalogs need to use the same database?

Maybe it's used for testing? I'm not sure.

Do users expect these two catalogs to be able to see each other's tables?

I'm not sure. What do you think?

diqiu50 commented 2 weeks ago

I think it may be used for testing.

FANNG1 commented 2 weeks ago

@diqiu50, could you summarize your point about the proposed changes?

diqiu50 commented 1 week ago

I think that if the user sets catalog.backend.name, it should use the user's setting; otherwise, it should use the default.
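The rule described here could be sketched roughly as follows. The class name, the `"catalog-backend-name"` key string, and the choice to treat a blank value as unset are assumptions; the fixed default `jdbc` is the value proposed earlier in the thread, not a confirmed decision.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: user-provided backend name wins, otherwise a fixed default is used.
class BackendNameResolver {
  static final String CATALOG_BACKEND_NAME = "catalog-backend-name"; // assumed key string
  static final String DEFAULT_BACKEND_NAME = "jdbc"; // default proposed in the thread

  static String resolve(Map<String, String> properties) {
    String userValue = properties.get(CATALOG_BACKEND_NAME);
    // Assumption: a missing or blank value counts as "not set".
    if (userValue != null && !userValue.trim().isEmpty()) {
      return userValue;
    }
    return DEFAULT_BACKEND_NAME;
  }

  public static void main(String[] args) {
    Map<String, String> properties = new HashMap<>();
    properties.put("catalog-name", "sales");
    System.out.println(resolve(properties)); // prints "jdbc"

    // Renaming the catalog does not change the resolved backend name.
    properties.put("catalog-name", "sales_renamed");
    System.out.println(resolve(properties)); // prints "jdbc"

    properties.put(CATALOG_BACKEND_NAME, "my_backend");
    System.out.println(resolve(properties)); // prints "my_backend"
  }
}
```

Because the default no longer depends on the catalog name, a rename leaves the resolved value untouched. The trade-off raised above still applies: two catalogs that share a JDBC URI and both rely on the default would see each other's tables.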