ray-project / deltacat

A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
166 stars 23 forks

refactored compaction_session.py #331

Closed akindu-amazon closed 4 months ago

akindu-amazon commented 4 months ago

Refactored compaction_session.py into more modular functions that are called from within _execute_compaction. These functions are:

These functions make it easier to support multiple compaction rounds for large tables, while tables that could already be compacted are compacted exactly as before (all deltacat pytest tests pass).
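As a rough illustration of the idea only (not the actual code in this PR), the sketch below shows how splitting a compaction routine into small steps lets a driver loop run them once per round. Every name here (`split_into_rounds`, `merge_round`, `execute_compaction_sketch`, the `pk`/`value` record shape) is hypothetical and is not part of the deltacat API.

```python
from typing import Dict, List

# Hypothetical record shape for this sketch: {"pk": <primary key>, "value": <payload>}
Record = Dict[str, int]


def split_into_rounds(deltas: List[Record], max_per_round: int) -> List[List[Record]]:
    """Split the input into ordered batches, one batch per compaction round."""
    if max_per_round <= 0:
        raise ValueError("max_per_round must be positive")
    return [deltas[i:i + max_per_round] for i in range(0, len(deltas), max_per_round)]


def merge_round(compacted: Dict[int, int], batch: List[Record]) -> Dict[int, int]:
    """Fold one batch into the running result; later records win per primary key."""
    result = dict(compacted)
    for record in batch:
        result[record["pk"]] = record["value"]
    return result


def execute_compaction_sketch(deltas: List[Record], max_per_round: int) -> Dict[int, int]:
    """Driver loop: because each step is its own function, it can run once per round."""
    compacted: Dict[int, int] = {}
    for batch in split_into_rounds(deltas, max_per_round):
        compacted = merge_round(compacted, batch)
    return compacted


if __name__ == "__main__":
    deltas = [
        {"pk": 1, "value": 10},
        {"pk": 2, "value": 20},
        {"pk": 1, "value": 30},
    ]
    # A table that fits in one round and the same table forced through
    # several rounds should compact to the same result.
    assert execute_compaction_sketch(deltas, 10) == execute_compaction_sketch(deltas, 1)
```

In this toy version, a table small enough for a single round takes exactly one pass through the loop, which mirrors the claim that previously compactable tables are compacted the same way.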