ray-project / deltacat

A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
166 stars, 23 forks

Fix divide by zero error when pyarrow table size comes out 0 #368

Closed. raghumdani closed this 3 weeks ago

raghumdani commented 4 weeks ago

Summary

This commit ensures that a pyarrow table size of 0 is handled without raising a divide-by-zero error.

Rationale

The size of the Arrow table can be zero even when the size of the file itself is not zero.
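
To illustrate the failure mode, here is a minimal sketch of a guarded ratio computation. It is not the code changed in this PR: the function name `file_to_table_size_ratio`, its parameters, and the `fallback` behavior are all hypothetical, and the sizes are passed as plain integers (for example, a value read from `pyarrow.Table.nbytes` and the file's size on disk) so the snippet runs without pyarrow installed.

```python
from typing import Optional


def file_to_table_size_ratio(
    table_nbytes: int,
    file_size_bytes: int,
    fallback: Optional[float] = None,
) -> Optional[float]:
    """Return file_size_bytes / table_nbytes, guarding the zero-size case.

    Hypothetical helper for illustration only. `table_nbytes` stands in for
    the in-memory size of an Arrow table, and `file_size_bytes` for the size
    of the file it was read from.
    """
    if table_nbytes < 0 or file_size_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if table_nbytes == 0:
        # The table can report 0 bytes even though the file is not empty,
        # so return the caller-supplied fallback instead of dividing.
        return fallback
    return file_size_bytes / table_nbytes


# A non-empty file whose table reports 0 bytes no longer raises.
print(file_to_table_size_ratio(0, 1024))
# The normal case still computes the ratio.
print(file_to_table_size_ratio(2048, 1024))
```

Returning a fallback rather than raising is one possible design choice; a caller could equally skip the file or substitute a default estimate, depending on how the ratio is used downstream.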

Changes

N/A

Impact

Existing unit tests pass. No impact.

Testing

Added test cases to reproduce an Arrow table size of 0.

Regression Risk

None
