apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0
1k stars, 311 forks

[FEATURE] how to implement metadata search? #2172

Open mygrsun opened 8 months ago

mygrsun commented 8 months ago

Describe the feature

I want to know if Gravitino has any plans to implement metadata search, for example table search. Gravitino uses KV storage and already stores table metadata there, but I don't know how to implement metadata search on top of it, because KV storage doesn't provide a search function. Do I have to replace it with another storage backend? Are there any ideas for implementing table search?

Motivation

No response

Describe the solution

No response

Additional context

No response

qqqttt123 commented 8 months ago

Thanks for bringing this up for discussion. Yes, we would like to support this in the future, but the feature isn't scheduled yet. There are several possible implementations. Netflix Metacat uses Elasticsearch. Alternatively, we could store some indexes in the KV storage to implement it, and so on. Some investigation is needed first. If you are interested, you are welcome to propose a design document and implement this feature. Any thoughts are welcome.
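
To make the "store some indexes in KV storage" idea concrete, here is a minimal sketch of a secondary index kept in an ordered KV store. Everything in it is an assumption for illustration: `SortedKV` is an in-memory stand-in and not Gravitino's storage API, and the `table/` and `idx/` key layouts and the helper names are hypothetical. It only assumes the store can do an ordered prefix scan.

```python
import bisect
import re


class SortedKV:
    """In-memory stand-in for an ordered KV store (hypothetical, not Gravitino's API)."""

    def __init__(self):
        self._keys = []  # keys kept in sorted order to allow prefix scans
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def scan_prefix(self, prefix):
        # Start at the first key >= prefix and stop once keys no longer match.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i]
            i += 1


def tokenize(name):
    # Split an identifier such as "user_orders" into lowercase tokens.
    return [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]


def put_table(kv, table_id, name):
    # Primary record plus one index entry per token of the table name.
    # table_id is assumed not to contain "/".
    kv.put(f"table/{table_id}", name)
    for token in tokenize(name):
        kv.put(f"idx/{token}/{table_id}", "")  # the key carries the information


def search_tables(kv, word_prefix):
    # Find table names that contain a token starting with word_prefix.
    ids = {key.rsplit("/", 1)[1] for key in kv.scan_prefix(f"idx/{word_prefix.lower()}")}
    return sorted(kv.get(f"table/{table_id}") for table_id in ids)
```

With tables named `user_orders`, `order_items`, and `customers`, `search_tables(kv, "order")` finds the first two. This covers only prefix matching on name tokens. Ranking, substring matching, and keeping the index consistent on rename or delete are exactly the parts a design document would need to address.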

YxAc commented 8 months ago

I think metadata search can be divided into two parts.

One part serves the data platform itself, such as listing the metadata that belongs to a user. This can be read directly from the database, using read-write separation on RDS to get data consistency and low latency.
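
As a rough illustration of this first part, the listing case is an ordinary indexed lookup. The sketch below uses an in-memory SQLite database as a stand-in for the relational store. The `table_meta` schema and the function names are hypothetical, and read-write separation (routing this query to a read replica) is not modeled here.

```python
import sqlite3


def open_metadata_db():
    # In-memory database standing in for the relational metadata store.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_meta ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, owner TEXT NOT NULL)"
    )
    # An index on owner keeps per-user listing cheap as the catalog grows.
    conn.execute("CREATE INDEX idx_table_meta_owner ON table_meta (owner)")
    return conn


def add_table(conn, name, owner):
    conn.execute(
        "INSERT INTO table_meta (name, owner) VALUES (?, ?)", (name, owner)
    )


def list_tables_for_owner(conn, owner):
    # Exact-match lookup: no search engine needed for this access pattern.
    rows = conn.execute(
        "SELECT name FROM table_meta WHERE owner = ? ORDER BY name", (owner,)
    )
    return [name for (name,) in rows]
```

In a deployment with read-write separation, `list_tables_for_owner` would presumably run against a replica connection while `add_table` goes to the primary.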

The other part supports data retrieval and analysis, such as data discovery and data governance analysis. For this, metadata (schema, business, and user-defined) can be published to Elasticsearch for full-text search.
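
A sketch of what "publishing" could look like: flatten each table's metadata into one search document and build the newline-delimited JSON body that the `_bulk` endpoint of Elasticsearch-style search engines accepts (an action line followed by a document line, with a trailing newline). The input dictionary shape, the field names, and the index name used in the example are assumptions, not Gravitino's model. The sketch only builds the payload and does not send it.

```python
import json


def to_search_doc(table):
    # Combine schema, business, and user-defined metadata into one document.
    return {
        "name": table["name"],
        "columns": [column["name"] for column in table["schema"]],
        "description": table.get("business", {}).get("description", ""),
        "tags": table.get("user_defined", {}).get("tags", []),
    }


def build_bulk_payload(index_name, tables):
    # Bulk format: one action line, then one source line, per document.
    lines = []
    for table in tables:
        action = {"index": {"_index": index_name, "_id": table["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(to_search_doc(table)))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

Using the table's fully qualified name as `_id` (a hypothetical choice) makes republishing idempotent: indexing the same table again overwrites the earlier document instead of duplicating it.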

justinmclean commented 8 months ago

Elasticsearch is not under a license compatible with the Apache license. You might want to consider using OpenSearch instead.