Mooncake-Labs / pg_mooncake

Iceberg/Delta Columnstore Table in Postgres
http://mooncake.dev
MIT License
228 stars 12 forks

Support deletion vectors #30

Open dpxcc opened 1 week ago

dpxcc commented 1 week ago

What feature are you requesting?

Use deletion vectors to track deleted rows in data files

Why are you requesting this feature?

Currently, our system uses a copy-on-write approach for UPDATE and DELETE operations: the entire data file is replaced even if only a single row is deleted. Deletion vectors would be more efficient, because a delete would only need to record which rows are gone instead of rewriting the file.

What is your proposed implementation for this feature?

When rows are deleted from a data file, a deletion vector is created as a bitmap that indicates whether each row within this data file is deleted or not. This deletion vector is stored separately, either in its own file or within a Postgres heap table. During a columnstore table scan, the deletion vector is used to skip over deleted rows. If a data file accumulates too many deletions, a new data file containing only the undeleted rows will be created.
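
To make the proposal concrete, here is a minimal Python sketch of the idea, not the pg_mooncake implementation. The names `DataFile`, `maybe_compact`, and `COMPACTION_THRESHOLD` are hypothetical, the data file is modeled as an in-memory list, and the 50% compaction threshold is an assumption (the proposal does not say how many deletions count as "too many").

```python
class DataFile:
    """Hypothetical stand-in for an immutable columnstore data file."""

    def __init__(self, rows):
        self.rows = list(rows)
        # Deletion vector: one bit per row, bit i set means row i is deleted.
        # It lives outside `rows`, so deleting never rewrites the row data.
        self.deletion_vector = bytearray((len(self.rows) + 7) // 8)

    def delete(self, row_idx):
        """Mark a row as deleted by setting its bit."""
        if not 0 <= row_idx < len(self.rows):
            raise IndexError(row_idx)
        self.deletion_vector[row_idx // 8] |= 1 << (row_idx % 8)

    def is_deleted(self, row_idx):
        return bool(self.deletion_vector[row_idx // 8] & (1 << (row_idx % 8)))

    def deleted_count(self):
        return sum(bin(byte).count("1") for byte in self.deletion_vector)

    def scan(self):
        """Yield only the rows whose deletion bit is not set."""
        for i, row in enumerate(self.rows):
            if not self.is_deleted(i):
                yield row


# Assumed threshold: compact once more than half of the rows are deleted.
COMPACTION_THRESHOLD = 0.5


def maybe_compact(data_file, threshold=COMPACTION_THRESHOLD):
    """Return a new file holding only the live rows if too many are deleted."""
    if data_file.rows and data_file.deleted_count() / len(data_file.rows) > threshold:
        return DataFile(data_file.scan())
    return data_file
```

With this sketch, `delete` only flips a bit, `scan` skips rows whose bit is set, and `maybe_compact` returns the same object until the deleted fraction exceeds the threshold, at which point it returns a fresh `DataFile` with an empty deletion vector. A real implementation would likely use a compressed bitmap and persist it in a separate file or a Postgres heap table, as described above.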