ray-project/deltacat
A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
166 stars
23 forks
Added multi-round splitting support
#339
Closed
akindu-amazon closed 3 months ago
akindu-amazon commented 3 months ago
Added multi-round splitting support:
New num_rounds parameter in CompactPartitionParams (default is 1)
New helper function that groups uniform_deltas into batches when num_rounds is greater than 1
New pytest case covering aggregation across multiple rounds (drop_duplicates = False)
Bug fix in read_delta_file_envelopes
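
For readers unfamiliar with the idea, the batching step described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the function name group_into_rounds and its signature are hypothetical, and plain integers stand in for the real uniform_deltas objects.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_into_rounds(items: Sequence[T], num_rounds: int = 1) -> List[List[T]]:
    """Split items into at most num_rounds contiguous batches.

    Hypothetical helper for illustration only; the actual helper added in
    this PR may use a different name, signature, and batching strategy.
    """
    if num_rounds < 1:
        raise ValueError("num_rounds must be >= 1")
    items = list(items)
    if not items:
        return []
    if num_rounds == 1:
        # Single round: everything is processed together.
        return [items]
    # Ceiling division so no more than num_rounds batches are produced.
    batch_size = -(-len(items) // num_rounds)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Seven stand-in deltas split across three rounds.
batches = group_into_rounds(list(range(7)), num_rounds=3)
```

With the default num_rounds of 1, the sketch returns a single batch, which mirrors the stated default behaviour of the new parameter. When the item count does not divide evenly, the last batch is smaller, and fewer than num_rounds batches may result.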