scylladb / scylla-migrator

Migrate data extract using Spark to Scylla, normally from Cassandra
Apache License 2.0
54 stars, 34 forks

Add Feature to Migrate Partial Table Data Based on Filter Criteria #102

Open ricardoborenstein opened 7 months ago

ricardoborenstein commented 7 months ago

I propose adding a feature to the Scylla Migrator that lets users migrate only part of a table, selected by filter criteria applied to the source table. This would make migrations more flexible and efficient when only a subset of the data needs to be transferred.

The feature would introduce the ability to specify filter criteria in the source settings configuration. This filter would then be applied when querying the source table, allowing only the data that matches the criteria to be migrated.

julienrf commented 1 week ago

Hello @ricardoborenstein. We do support a where property in the configuration when migrating from a CQL-compatible source:

https://github.com/scylladb/scylla-migrator/blob/95826a28b77de4870f91a2528bb1e2f2ba3cf7f5/config.yaml.example#L44-L45
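For illustration, a minimal sketch of what such a filter might look like in the source section of the migrator's YAML configuration. Only the where property itself comes from the linked example file; the host, keyspace, table, and column names below are placeholder assumptions, and the linked file remains the authoritative reference for the available settings.

```yaml
# Sketch only: all values below are hypothetical placeholders.
source:
  type: cassandra            # a CQL-compatible source
  host: localhost            # assumed host
  port: 9042
  keyspace: my_keyspace      # hypothetical keyspace
  table: my_table            # hypothetical table
  # Optional condition restricting which source rows are migrated.
  # The column name and value are made up for this example.
  where: "created_at >= '2024-01-01'"
```

Because the condition is applied when querying the source table, it presumably has to be a predicate the source database can evaluate, for example one on key or indexed columns in a CQL source.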

Does that address your need? Do you need something similar for DynamoDB-compatible sources?