Use deletion vectors to track deleted rows in data files
Why are you requesting this feature?
Currently, our system uses a copy-on-write approach for UPDATE and DELETE operations: the entire data file is replaced even if only a single row is deleted. Deletion vectors would be more efficient, because a delete marks rows as deleted instead of rewriting the whole file.
What is your proposed implementation for this feature?
When rows are deleted from a data file, a deletion vector is created for that file: a bitmap with one bit per row that indicates whether the row is deleted.
The deletion vector is stored separately from the data file, either in its own file or in a Postgres heap table.
During a columnstore table scan, the deletion vector is used to skip deleted rows.
If a data file accumulates too many deletions, a new data file containing only the undeleted rows is created to replace it.
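
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not a proposed implementation: the `DataFile` class, the `maybe_compact` helper, and the 50% compaction threshold are all hypothetical names and values chosen for the example, and rows are held in an in-memory list rather than a real columnar file.

```python
class DataFile:
    """Hypothetical immutable data file; rows are addressed by position."""

    def __init__(self, rows):
        self.rows = list(rows)
        # Deletion vector: one bit per row, kept apart from the row data.
        # Bit i set means row i is deleted.
        self.deleted = bytearray((len(self.rows) + 7) // 8)

    def delete(self, pos):
        """Mark one row as deleted without rewriting the row data."""
        if not 0 <= pos < len(self.rows):
            raise IndexError(pos)
        self.deleted[pos // 8] |= 1 << (pos % 8)

    def is_deleted(self, pos):
        return bool(self.deleted[pos // 8] & (1 << (pos % 8)))

    def scan(self):
        """Yield only live rows, skipping positions set in the vector."""
        for pos, row in enumerate(self.rows):
            if not self.is_deleted(pos):
                yield row

    def deleted_fraction(self):
        if not self.rows:
            return 0.0
        count = sum(self.is_deleted(p) for p in range(len(self.rows)))
        return count / len(self.rows)


def maybe_compact(data_file, threshold=0.5):
    """Return a new file holding only live rows once the deleted
    fraction exceeds the (assumed) threshold; otherwise return the
    same file unchanged."""
    if data_file.deleted_fraction() > threshold:
        return DataFile(data_file.scan())
    return data_file
```

For example, deleting one of four rows only flips a bit and the scan skips that row; after three of the four rows are deleted, `maybe_compact` returns a fresh file with the single remaining row and an empty deletion vector. A real implementation would likely use a compressed bitmap and persist the vector in its own file or a heap table, as described above.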