Open caican00 opened 7 months ago
Hi @FANNG1, I made a draft plan for supporting the Iceberg REST service in the spark-connector. Could you please review it when you are free? Thank you very much.
cc @coolderli
@FANNG1 Can you please take a look?
@caican00 thanks for proposing this. Overall, I think the Iceberg REST catalog is just one of Iceberg's catalogs, similar to HiveCatalog or the JDBC catalog, so I prefer to add a new catalog backend called `rest`, with the REST catalog server address as `uri`. For Hive as the REST catalog's backend, we could provide something like `backend-catalog.uri` to distinguish it from the current `uri`. WDYT?
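For illustration, the proposal above could translate into catalog properties along these lines. This is a minimal sketch, not Gravitino code: the keys `catalog-backend`, `uri`, and `backend-catalog.uri` come from this thread, while the class name, method name, and both URIs are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: keys follow the comment above; the URIs are made-up placeholders. */
final class RestBackendProperties {

  // Property keys mentioned in the discussion.
  static final String CATALOG_BACKEND = "catalog-backend";
  static final String URI = "uri";
  static final String BACKEND_CATALOG_URI = "backend-catalog.uri";

  /** Builds the properties an Iceberg catalog might carry under this proposal. */
  static Map<String, String> restCatalogProperties(String restServerUri, String backendUri) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put(CATALOG_BACKEND, "rest");
    // Address of the Iceberg REST catalog server.
    props.put(URI, restServerUri);
    if (backendUri != null) {
      // Optional: the catalog behind the REST server, for example a Hive metastore.
      props.put(BACKEND_CATALOG_URI, backendUri);
    }
    return props;
  }

  public static void main(String[] args) {
    Map<String, String> props =
        restCatalogProperties("http://rest-host:9001/iceberg/", "thrift://hms-host:9083");
    props.forEach((key, value) -> System.out.println(key + " = " + value));
  }
}
```

Keeping the two URIs under separate keys is what lets `uri` keep its current meaning (the address the catalog talks to) while the backend address remains purely informational for the client.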
@FANNG1 I am sorry for taking so long to reply. I have some doubts about this plan:
- If we add a new catalog backend called `rest`, should we instantiate a `RestCatalog` instance on the server side?
- If we instantiate a `RestCatalog` instance on the server side, how do we set up the real backend, such as HMS? We need the real backend catalog to interact with the backend storage in the `rest server`.
- If we add a new catalog backend called `rest`, should we instantiate a `RestCatalog` instance on the server side?
Yes, `RestCatalog` actually implements the REST client.
- If we instantiate a `RestCatalog` instance on the server side, how do we set up the real backend, such as HMS? We need the real backend catalog to interact with the backend storage in the `rest server`.
I think this is the responsibility of the Iceberg REST catalog server, not the Gravitino Iceberg catalog.
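The division of responsibility described in these answers can be pictured with a small conceptual sketch. None of the types below are Gravitino or Iceberg classes; they are hypothetical stand-ins, and the "server" is an in-process object rather than an HTTP endpoint. The point is only that the client side holds a reference to the server, while the server alone knows the real backend.

```java
import java.util.Arrays;
import java.util.List;

/** Conceptual sketch; every type here is a hypothetical stand-in. */
final class RestLayeringSketch {

  interface TableLister {
    List<String> listTables(String namespace);
  }

  /** Stands in for the real backend (for example HMS), known only to the REST server. */
  static final class BackendCatalog implements TableLister {
    @Override
    public List<String> listTables(String namespace) {
      return Arrays.asList(namespace + ".t1", namespace + ".t2");
    }
  }

  /** Stands in for the Iceberg REST catalog server, which owns the backend setup. */
  static final class RestServer {
    private final TableLister backend;

    RestServer(TableLister backend) {
      this.backend = backend;
    }

    List<String> handleListTables(String namespace) {
      return backend.listTables(namespace);
    }
  }

  /** Stands in for a REST client catalog: it forwards calls and has no backend settings. */
  static final class RestClientCatalog implements TableLister {
    private final RestServer server; // in a real deployment, an HTTP address given by `uri`

    RestClientCatalog(RestServer server) {
      this.server = server;
    }

    @Override
    public List<String> listTables(String namespace) {
      return server.handleListTables(namespace);
    }
  }

  public static void main(String[] args) {
    RestServer server = new RestServer(new BackendCatalog());
    TableLister client = new RestClientCatalog(server);
    System.out.println(client.listTables("db"));
  }
}
```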
I think it's okay.
Describe the subtask
Support the Iceberg REST catalog in the spark-connector.
- A `lakehouse-iceberg` catalog is created without registering the `rest service uri`.
- The spark-connector cannot get the `rest service uri` from the loaded catalog properties.
I think that we should support registering the `rest service uri` when creating a `lakehouse-iceberg` catalog, and I have a draft plan here:
- If `enable-rest-service` is checked on the UI, the Iceberg `rest service uri` must be specified.
- We may not need to consider supporting `rest catalog-backend` in the early stage: the `rest service uri` is required regardless of which catalog-backend is chosen. Classes named in the plan:
  - `com.datastrato.gravitino.catalog.lakehouse.iceberg.IcebergCatalogPropertiesMetadata`
  - `com.datastrato.gravitino.catalog.lakehouse.iceberg.IcebergConfig`
- The spark-connector gets the `rest service uri` directly from the Iceberg catalog properties, without manual configuration.
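The plan above could be sketched roughly as follows. This is an assumption-laden illustration, not the connector's implementation: the property keys `enable-rest-service` and `rest-service-uri` and all class and method names are hypothetical (the issue names the concepts, not the exact keys). The Spark settings follow the form Iceberg documents for a REST catalog (`spark.sql.catalog.<name>`, `.type`, `.uri`).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of the draft plan; key names and class name are hypothetical. */
final class RestServiceUriPlan {

  static final String ENABLE_REST_SERVICE = "enable-rest-service";
  static final String REST_SERVICE_URI = "rest-service-uri";

  private static boolean isBlank(String value) {
    return value == null || value.trim().isEmpty();
  }

  /** Rejects catalog properties that enable the REST service without giving its URI. */
  static void validate(Map<String, String> catalogProps) {
    boolean enabled =
        Boolean.parseBoolean(catalogProps.getOrDefault(ENABLE_REST_SERVICE, "false"));
    if (enabled && isBlank(catalogProps.get(REST_SERVICE_URI))) {
      throw new IllegalArgumentException(
          REST_SERVICE_URI + " must be specified when " + ENABLE_REST_SERVICE + " is set");
    }
  }

  /** Derives Spark settings for an Iceberg REST catalog from the loaded catalog properties. */
  static Map<String, String> toSparkConf(String catalogName, Map<String, String> catalogProps) {
    validate(catalogProps);
    String uri = catalogProps.get(REST_SERVICE_URI);
    if (isBlank(uri)) {
      throw new IllegalStateException("catalog has no " + REST_SERVICE_URI + " registered");
    }
    String prefix = "spark.sql.catalog." + catalogName;
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(prefix, "org.apache.iceberg.spark.SparkCatalog");
    conf.put(prefix + ".type", "rest");
    conf.put(prefix + ".uri", uri);
    return conf;
  }

  public static void main(String[] args) {
    Map<String, String> catalogProps = new LinkedHashMap<>();
    catalogProps.put(ENABLE_REST_SERVICE, "true");
    catalogProps.put(REST_SERVICE_URI, "http://rest-host:9001/iceberg/"); // placeholder URI
    toSparkConf("iceberg_rest", catalogProps)
        .forEach((key, value) -> System.out.println(key + "=" + value));
  }
}
```

Validating at catalog creation time means the connector can rely on the URI being present whenever the REST service is enabled, which is what removes the need for manual configuration on the Spark side.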
Parent issue
https://github.com/datastrato/gravitino/issues/1571