ray-project / deltacat

A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
166 stars 23 forks

Adding a test to assert RCF values are calculated correctly #349

Closed. raghumdani closed this 2 months ago

raghumdani commented 2 months ago

This PR adds a test to ensure that the RCF (round completion file) stats, inputInflation and averageRecordSizeBytes, are calculated from the incremental data only.
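To make the intent concrete, here is a minimal sketch of what "calculated from the incremental data only" could mean. It is not DeltaCAT's implementation. `DeltaStats`, `rcf_stats`, and the `is_incremental` flag are hypothetical names introduced for illustration. The sketch also assumes that input inflation is the ratio of in-memory bytes to on-disk bytes, and that average record size is in-memory bytes divided by record count.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DeltaStats:
    """Hypothetical per-delta sizes (illustrative only, not a DeltaCAT type)."""
    on_disk_bytes: int
    in_memory_bytes: int
    record_count: int
    is_incremental: bool  # False for previously compacted data


def rcf_stats(deltas: Iterable[DeltaStats]) -> Tuple[float, float]:
    """Return (input_inflation, average_record_size_bytes).

    Assumed definitions:
      input_inflation           = in-memory bytes / on-disk bytes
      average_record_size_bytes = in-memory bytes / record count
    Only incremental deltas contribute to either ratio.
    """
    incremental = [d for d in deltas if d.is_incremental]
    on_disk = sum(d.on_disk_bytes for d in incremental)
    in_memory = sum(d.in_memory_bytes for d in incremental)
    records = sum(d.record_count for d in incremental)
    if on_disk == 0 or records == 0:
        raise ValueError("no incremental data to derive stats from")
    return in_memory / on_disk, in_memory / records


deltas = [
    DeltaStats(on_disk_bytes=100, in_memory_bytes=400,
               record_count=10, is_incremental=True),
    # Previously compacted data: must not influence the stats.
    DeltaStats(on_disk_bytes=1000, in_memory_bytes=1500,
               record_count=100, is_incremental=False),
]
inflation, avg_record_size = rcf_stats(deltas)
```

A test in this spirit would check that the stats do not change when the non-incremental entry is added or removed.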