Open FineAndDandy opened 9 months ago
I can take a look at this
I'm not so sure we should do this. This would require a new API, so it can't be done in 2.1 where it'd be most useful. Users already have the ability to clone and efficiently truncate a table. That efficiency is limited in 2.1, due to chop compactions, which go away in 3.1. In 3.1, it'd probably be better to implement support for allowing range deletion to occur on an offline table, since it doesn't need to be online for chop compactions. That would support an offline truncate, for the situations where users don't want to bring the table online and host it in order to perform the operation. For the elasticity branch, I believe the truncate operation can already happen on an unhosted table, so it's not needed there.
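For reference, the clone-then-truncate workaround mentioned above could be sketched roughly like this. `TableOps` is a hypothetical stand-in (not Accumulo's API) for the two `TableOperations` calls involved, `clone` and `deleteRows`, simplified to `String` rows and no checked exceptions. The sketch assumes `deleteRows` removes the range (start, end], with start exclusive and end inclusive, and treats `null` as unbounded.

```java
public class CloneRangeSketch {

  /** Hypothetical stand-in for the Accumulo TableOperations calls used here. */
  public interface TableOps {
    // Assumed to behave like TableOperations.clone (flush/property arguments omitted).
    void cloneTable(String src, String dest);

    // Assumed to behave like TableOperations.deleteRows:
    // deletes rows in (startExclusive, endInclusive]; null means unbounded.
    void deleteRows(String table, String startExclusive, String endInclusive);
  }

  /**
   * Clones src to dest, then truncates dest so that only rows in
   * (keepStartExclusive, keepEndInclusive] remain. A null bound keeps
   * everything on that side.
   */
  public static void cloneRange(TableOps ops, String src, String dest,
      String keepStartExclusive, String keepEndInclusive) {
    ops.cloneTable(src, dest);
    if (keepStartExclusive != null) {
      // Drop everything up to and including the start of the kept range.
      ops.deleteRows(dest, null, keepStartExclusive);
    }
    if (keepEndInclusive != null) {
      // Drop everything after the end of the kept range.
      ops.deleteRows(dest, keepEndInclusive, null);
    }
  }
}
```

Note that the clone still references every file of the source table until the deletes are processed, which is the overhead the feature request is trying to avoid.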
Is your feature request related to a problem? Please describe.
Cloning a whole table is overkill when only a small subset of it is needed. Cloning everything for a small fragment of data produces a large amount of unnecessary GC overhead when the clone is cleaned up.
Describe the solution you'd like
Adding an optional range to the clone table operation would allow a subset of the table to be cloned. This would limit the GC overhead to only the files relevant to the clone.