ray-project / deltacat

A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
162 stars, 23 forks

Avoid running bucketing twice #327

Closed by raghumdani 4 months ago

raghumdani commented 4 months ago

Fix a bug where we generate the pk hash column twice.

Zyiqin-Miranda commented 4 months ago

Introduced by this commit?

raghumdani commented 4 months ago

Yes.

Zyiqin-Miranda commented 4 months ago

LGTM