apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Support Iceberg Metadata storage in a variety of engines #3997

Open melin opened 2 years ago

melin commented 2 years ago

Most Iceberg metadata is stored in the file system, so it is limited by NameNode performance. Storage engines such as an RDBMS, Cassandra, or MongoDB could be supported through pluggable storage.
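
To make the request concrete, here is a minimal sketch of what a pluggable metadata backend could look like. It is only an illustration: `MetadataStore`, `InMemoryStore`, and `compare_and_swap` are hypothetical names, not Iceberg APIs, and the in-memory dictionary stands in for an RDBMS, Cassandra, or MongoDB backend.

```python
import threading
from abc import ABC, abstractmethod
from typing import Dict, Optional


class MetadataStore(ABC):
    """Hypothetical pluggable backend holding each table's current metadata."""

    @abstractmethod
    def load(self, table: str) -> Optional[str]:
        """Return the current metadata document for a table, or None."""

    @abstractmethod
    def compare_and_swap(self, table: str, expected: Optional[str], new: str) -> bool:
        """Atomically replace the metadata only if it still equals `expected`."""


class InMemoryStore(MetadataStore):
    """Stand-in for a real backend; a dict guarded by a lock (assumption)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tables: Dict[str, str] = {}

    def load(self, table: str) -> Optional[str]:
        with self._lock:
            return self._tables.get(table)

    def compare_and_swap(self, table: str, expected: Optional[str], new: str) -> bool:
        with self._lock:
            if self._tables.get(table) != expected:
                return False  # another writer committed first
            self._tables[table] = new
            return True


store = InMemoryStore()
first = store.compare_and_swap("db.events", None, "metadata-v1")
stale = store.compare_and_swap("db.events", None, "metadata-v2")
```

In this sketch the second call loses because its expected value is stale, which is the atomic swap that a file-system rename provides today and that any replacement backend would have to provide.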

flyrain commented 2 years ago

With the new REST API design, we should be able to use an RDBMS or a key-value store to replace the metadata.json files. The new APIs are a work in progress: the client will be provided in the Iceberg repo, but users will need to implement the server side once the APIs are ready. I believe an open source server will exist eventually, though probably as a separate project. Beyond the metadata.json file, putting manifest-list or manifest files into an RDBMS or key-value store would need a major overhaul. It is theoretically possible, but I am not sure it is the direction people want to go.

melin commented 2 years ago

ByteDance has implemented a Hudi Metastore Server: https://cwiki.apache.org/confluence/display/HUDI/RFC-36%3A+HUDI+Metastore+Server

flyrain commented 2 years ago

Thanks for sharing. The Hudi metadata server makes sense in general. However, Iceberg doesn’t have some of the issues that Hudi has, for example the file listing issue in Hudi metadata.

Here are some benefits of an Iceberg metadata server.

  1. Multi-table transactions
  2. Performance improvements
    1. Avoid sending the full metadata JSON file from client to server
    2. Queue multiple commits on the server, rather than having each client resolve write-write conflicts by itself
    3. Cache resources such as JDBC connections
  3. Safer commits
    1. Clients with different versions can commit safely, without worrying about overwriting newer properties
    2. The server side holds the truth of the table format, so upgrading to a newer version doesn’t require changing all clients as long as API compatibility is kept

There could be more benefits though.
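
A few of the points above (sending only changes, serializing commits on the server, and not overwriting properties an older client doesn't know about) can be pictured with a small sketch. This is an assumption-laden illustration, not the REST API being designed: `MetadataServer`, `TableState`, and the property keys are all hypothetical.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TableState:
    """Server-side source of truth for one table (hypothetical shape)."""
    version: int = 0
    properties: Dict[str, str] = field(default_factory=dict)


class MetadataServer:
    """Toy server: commits are serialized by a lock and applied as deltas."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tables: Dict[str, TableState] = {}

    def commit(self, table: str, updates: Dict[str, str]) -> int:
        # Clients send only the properties they change, never the full
        # metadata document, so keys they don't know about are untouched.
        with self._lock:
            state = self._tables.setdefault(table, TableState())
            state.properties.update(updates)
            state.version += 1
            return state.version

    def properties(self, table: str) -> Dict[str, str]:
        with self._lock:
            return dict(self._tables[table].properties)


server = MetadataServer()
# A newer client sets a property that older clients don't understand.
v1 = server.commit("db.events", {"hypothetical.new-setting": "on"})
# An older client commits its own change without knowing about that key.
v2 = server.commit("db.events", {"owner": "etl"})
merged = server.properties("db.events")
```

Because the older client never rewrites the whole document, the newer client's property survives its commit. A real server would also need conflict detection for changes that touch the same state, which this sketch leaves out.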

melin commented 2 years ago

"file listing issue in Hudi metadata" => RFC-15: HUDI Metadata Table and Cloud/DFS File Listing Improvements

bquinart commented 2 years ago

I think it would make sense to consider, for example, FoundationDB as the storage layer for the metadata. That's what Snowflake and Firebolt currently use. Delta also seems to be considering this (https://github.com/delta-io/delta/issues/867). Of course, it could be another similar transactional, highly available, scalable, low-latency store (if one exists). Decoupling the metadata from the actual storage would open up a lot of new use cases, in particular evolving Iceberg into the storage layer for "modern cloud data warehouses".