Is your feature request related to a problem? Please describe.
I have scenario come up frequently on various clusters. You have a table that has billions of rows and you need to update say 100 million rows. There is no way to easily do this with Cockroach DB. You can't really do this as one statement and need to create some code to batch up the delete.
It would be real useful if this sort of activity was managed internally by Cockroach DB using a syntax such as:
UPDATE activity_log
SET is_active=false
WHERE country_id = <$1>
USING BATCHES OF 1000;
... or ...
DELETE FROM activity_log
WHERE is_active=false
USING BATCHES OF 1000;
This would just run in batches as a job and not have to be transactionally consistent across the whole operation. This would allow developers and database operations to not have to manually script these operations.
Describe the solution you'd like
Have the database handle these operations as a batch job.
Describe alternatives you've considered
I have created a description of this for people troubleshooting this behavior. Additionally, I have created a sample trimmer tool to do parallel bulk deletion of data. This tool parallelizes the DELETE into threads defined by the histogram statistics for the table.
Is your feature request related to a problem? Please describe.
I have scenario come up frequently on various clusters. You have a table that has billions of rows and you need to update say 100 million rows. There is no way to easily do this with Cockroach DB. You can't really do this as one statement and need to create some code to batch up the delete.
It would be real useful if this sort of activity was managed internally by Cockroach DB using a syntax such as:
... or ...
This would just run in batches as a job and not have to be transactionally consistent across the whole operation. This would allow developers and database operations to not have to manually script these operations.
Describe the solution you'd like Have the database handle these operations as a batch job.
Describe alternatives you've considered I have created a description of this for people troubleshooting this behavior. Additionally, I have created a sample trimmer tool to do parallel bulk deletion of data. This tool parallelizes the
DELETE
into threads defined by the histogram statistics for the table.Jira issue: CRDB-18517