delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
2.02k stars, 365 forks

Support for cloning meta-operations (shallow clones) #2456

Open mjclarke94 opened 2 months ago

mjclarke94 commented 2 months ago

Description

Add support for creating and managing shallow clones (a Delta Lake feature since 2.3) via delta-rs, with Python bindings.

Use Case

Shallow clones are very valuable for testing new features in ephemeral environments against production data, without huge memory usage or disruption to production systems. A one-liner that creates an isolated test environment is especially valuable when users have read-only access to the table: they can use this feature to cheaply create their own writable branch of the data.
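For context, the one-liner mentioned above already exists on the Spark side of Delta Lake as a `SHALLOW CLONE` SQL statement. The sketch below only builds that statement for path-based tables; it is not a delta-rs API (none exists for this yet, which is the point of this request). The helper name `shallow_clone_sql` and the example paths are hypothetical, and running the statement assumes a Spark session configured with Delta Lake 2.3 or later.

```python
def shallow_clone_sql(source_path: str, target_path: str) -> str:
    """Build a Spark SQL statement that shallow-clones a path-based Delta table.

    Hypothetical helper for illustration only; it is not part of delta-rs.
    """

    def as_delta_table(path: str) -> str:
        # Path-based Delta tables are written as delta.`<path>` in Spark SQL.
        # Reject backticks so the path cannot break out of the quoting.
        if "`" in path:
            raise ValueError("path must not contain a backtick")
        return f"delta.`{path}`"

    return (
        f"CREATE TABLE {as_delta_table(target_path)} "
        f"SHALLOW CLONE {as_delta_table(source_path)}"
    )


# Usage, assuming an existing SparkSession named `spark` with Delta Lake enabled
# (both paths are made up for this example):
# spark.sql(shallow_clone_sql("/data/prod/events", "/tmp/events_clone"))
```

The clone gets its own transaction log but references the source table's data files, which is why it is cheap to create and why writes to the clone do not touch the source.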

xbrianh commented 2 months ago

We also have use cases for shallow clones.

ion-elgreco commented 2 months ago

You can also achieve this with LakeFS, which IMHO is much better for such use cases.

mjclarke94 commented 2 months ago

Yeah, it looks really powerful. My rationale for wanting this functionality here is that adding the command here requires zero extra infrastructure.

I don't disagree LakeFS looks better, it's just more overhead to use it!

ion-elgreco commented 2 months ago

> Yeah, it looks really powerful. My rationale for wanting this functionality here is that adding the command here requires zero extra infrastructure.
>
> I don't disagree LakeFS looks better, it's just more overhead to use it!

True :), but this wouldn't be high on my priority list to add