Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores

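Delta Lake addresses how to build a data lake with decent performance on top of object storage.

There are many challenges here: for example, operations such as rename are expensive on object stores, and object stores offer no transactions.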

Delta Lake's core design: we maintain information about which objects are part of a Delta table in an ACID manner, using a write-ahead log that is itself stored in the cloud object store. The objects themselves are encoded in Parquet, making it easy to write connectors from engines that can already process Parquet. This design allows clients to update multiple objects at once, replace a subset of the objects with another, etc., in a serializable manner while still achieving high parallel read and write performance from the objects themselves (similar to raw Parquet). The log also contains metadata such as min/max statistics for each data file, enabling order of magnitude faster metadata searches than the “files in object store” approach. Crucially, we designed Delta Lake so that all the metadata is in the underlying object store, and transactions are achieved using optimistic concurrency protocols against the object store (with some details varying by cloud provider). This means that no servers need to be running to maintain state for a Delta table; users only need to launch servers when running queries, and enjoy the benefits of separately scaling compute and storage.
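
To make the log-based design above more concrete, here is a minimal Python sketch of the idea, not Delta Lake's actual implementation or API. Everything in it is an assumption made for illustration: the in-memory `InMemoryObjectStore`, its `put_if_absent` primitive (standing in for whatever atomic "create only if missing" mechanism a given cloud provider offers), the `_log/` key layout, and the JSON action format with `op`, `path`, and `stats` fields. It shows three things from the passage: a commit is one new log record that can add and remove several objects at once, writers race optimistically for the next log version, and readers use per-file min/max statistics from the log to skip files.

```python
import json


class InMemoryObjectStore:
    """Toy stand-in for a cloud object store (hypothetical, not a real API)."""

    def __init__(self):
        self._objects = {}

    def put_if_absent(self, key, data):
        # Assumed atomic primitive: succeeds only if the key does not exist yet.
        if key in self._objects:
            return False
        self._objects[key] = data
        return True

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix):
        return sorted(k for k in self._objects if k.startswith(prefix))


LOG_PREFIX = "_log/"  # hypothetical location of the log inside the table


def log_key(version):
    # Zero-padded so lexicographic order matches numeric order.
    return f"{LOG_PREFIX}{version:020d}.json"


def latest_version(store):
    # Versions are contiguous from 0, so the count gives the latest one.
    return len(store.list_keys(LOG_PREFIX)) - 1  # -1 means no commits yet


def commit(store, actions, max_retries=5):
    """Optimistically append one log record; retry if another writer wins."""
    for _ in range(max_retries):
        version = latest_version(store) + 1
        if store.put_if_absent(log_key(version), json.dumps(actions)):
            return version
        # Lost the race for this version. A real protocol would also check
        # whether the winning commit conflicts with ours before retrying.
    raise RuntimeError("too many concurrent writers")


def snapshot(store):
    """Replay the log to find which data files currently make up the table."""
    files = {}
    for key in store.list_keys(LOG_PREFIX):
        for action in json.loads(store.get(key)):
            if action["op"] == "add":
                files[action["path"]] = action["stats"]
            elif action["op"] == "remove":
                files.pop(action["path"], None)
    return files


def files_matching(store, value):
    """Use min/max statistics from the log to skip files that cannot match."""
    return sorted(
        path
        for path, stats in snapshot(store).items()
        if stats["min"] <= value <= stats["max"]
    )


store = InMemoryObjectStore()
commit(store, [
    {"op": "add", "path": "a.parquet", "stats": {"min": 0, "max": 9}},
    {"op": "add", "path": "b.parquet", "stats": {"min": 10, "max": 19}},
])
# Replace a.parquet with c.parquet in a single log record.
commit(store, [
    {"op": "remove", "path": "a.parquet"},
    {"op": "add", "path": "c.parquet", "stats": {"min": 0, "max": 9}},
])
print(sorted(snapshot(store)))
print(files_matching(store, 12))
```

Because the second commit is a single log record, a reader replaying the log sees either the table before the replacement or the table after it, never a mix. No server holds state here: the store's contents alone determine the table, which is the property the passage emphasizes.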
