apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

How to connect to our existing remote HDFS and obtain metadata #5374

Closed · yu286063991 closed this issue 2 weeks ago

yu286063991 commented 3 weeks ago

We attempted to connect to HDFS by entering the IP and port in the location parameter, but we were unable to retrieve catalogs and filesets, and Gravitino showed no error messages. How should we configure it to retrieve the metadata successfully? The image below shows our configuration, through which we are unable to obtain HDFS metadata:

[image: hdfs]
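For context, a fileset catalog backed by HDFS is created by POSTing a catalog definition whose `location` property is a full HDFS URI (scheme, namenode host, port, and path), not an IP and port alone. The sketch below only builds such a request body; the metalake name, catalog name, and namenode address are placeholders, not values from this issue.

```python
import json

# Hypothetical names -- substitute your own metalake and namenode address.
GRAVITINO_URL = "http://localhost:8090"
METALAKE = "my_metalake"

# Request body for creating a Hadoop fileset catalog. The "location" property
# should be a full HDFS URI such as hdfs://<namenode-host>:<port>/<path>.
catalog_payload = {
    "name": "hdfs_catalog",
    "type": "FILESET",
    "provider": "hadoop",
    "comment": "catalog backed by a remote HDFS cluster",
    "properties": {
        "location": "hdfs://namenode-host:8020/user/gravitino",
    },
}

# The actual call would be:
#   POST {GRAVITINO_URL}/api/metalakes/{METALAKE}/catalogs
# with Content-Type: application/json and this body.
print(json.dumps(catalog_payload, indent=2))
```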

yu286063991 commented 3 weeks ago

We tested against version 0.6.1-incubating.

FANNG1 commented 3 weeks ago

Fileset is not meant to manage HDFS metadata; it is used to manage a mapping between a logical directory and a physical directory.

yu286063991 commented 3 weeks ago

Thank you for your reply. So instead of directly querying HDFS metadata, do we need to create a mapping between HDFS directories and logical directories through the Fileset API?

FANNG1 commented 3 weeks ago

Sorry, I couldn't follow your point. Generally, you need to create a fileset that maps an HDFS directory, and then you can read and write the HDFS data via gvfs://xxx rather than hdfs://xx.
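Conceptually, resolving a gvfs:// path just rewrites the logical fileset prefix into the fileset's physical storage location; the real translation is performed by the Gravitino server and client libraries. A toy sketch of that idea, with invented catalog, schema, fileset, and storage names:

```python
# Toy illustration of the logical-to-physical mapping a fileset provides.
# Real resolution is done by Gravitino; all names here are invented.
FILESET_LOCATIONS = {
    # (catalog, schema, fileset) -> physical storage location
    ("hdfs_catalog", "sales", "events"): "hdfs://namenode-host:8020/data/sales/events",
}

def resolve_gvfs_path(gvfs_path: str) -> str:
    """Map gvfs://fileset/<catalog>/<schema>/<fileset>/<sub-path> to its HDFS location."""
    prefix = "gvfs://fileset/"
    if not gvfs_path.startswith(prefix):
        raise ValueError("not a gvfs fileset path: " + gvfs_path)
    parts = gvfs_path[len(prefix):].split("/", 3)
    catalog, schema, fileset = parts[0], parts[1], parts[2]
    sub_path = parts[3] if len(parts) == 4 else ""
    physical = FILESET_LOCATIONS[(catalog, schema, fileset)]
    return physical + ("/" + sub_path if sub_path else "")

print(resolve_gvfs_path("gvfs://fileset/hdfs_catalog/sales/events/2024/01/part-0.parquet"))
# -> hdfs://namenode-host:8020/data/sales/events/2024/01/part-0.parquet
```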

yu286063991 commented 3 weeks ago

Does creating a mapping mean calling the API to create a fileset corresponding to the HDFS directory? For example, the following API: http://localhost:8090/api/metalakes/:metalake/catalogs/:catalog/schemas/:schema/filesets
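A sketch of what the body POSTed to that endpoint could look like, assuming an EXTERNAL fileset that registers an already-existing HDFS directory; the metalake, catalog, schema, fileset name, and storage location below are placeholders, not values from this issue.

```python
import json

# Hypothetical names -- substitute your own metalake, catalog, schema, and paths.
FILESET_ENDPOINT = ("http://localhost:8090/api/metalakes/my_metalake"
                    "/catalogs/hdfs_catalog/schemas/sales/filesets")

# An EXTERNAL fileset points at a directory that already exists on HDFS;
# storageLocation is the physical directory the logical fileset name maps to.
fileset_payload = {
    "name": "events",
    "type": "EXTERNAL",
    "comment": "existing HDFS directory registered as a fileset",
    "storageLocation": "hdfs://namenode-host:8020/data/sales/events",
    "properties": {},
}

# POST FILESET_ENDPOINT with Content-Type: application/json and this body.
print(json.dumps(fileset_payload, indent=2))
```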

yuqi1129 commented 3 weeks ago

Exactly. A fileset can map a logical directory to a physical location while hiding the actual implementation. So, I wonder what problem you're encountering.

yu286063991 commented 3 weeks ago

The problem has been resolved. Thank you for your support.