scylladb / scylla-migrator

Migrates data to Scylla using Spark, typically from Cassandra or Parquet files. Alternatively, migrates from DynamoDB to Scylla Alternator.
https://migrator.docs.scylladb.com/stable/
Apache License 2.0
58 stars, 35 forks

Improved debug on migrations #181

Closed: pdbossman closed this issue 1 month ago

pdbossman commented 2 months ago

There are various mandatory and optional input parameters for a Scylla migration.

It would be useful to dump the input parameters and the actual parameters used at the beginning of a migration, for example the AWS region for DynamoDB or scanSegments.

Right now, it is guesswork what these parameters are. If they were visible, diagnosing problems would be much faster.
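
To illustrate the kind of startup dump requested here, below is a minimal Python sketch (not the migrator's actual code, which is Scala). The function names, the `DEFAULTS` table, and the default values are all hypothetical; only the parameter names `region` and `scanSegments` come from this issue.

```python
import logging

logger = logging.getLogger("migrator.sketch")

# Hypothetical defaults for illustration only; the real migrator's
# defaults may differ.
DEFAULTS = {"region": None, "scanSegments": 1}


def effective_parameters(user_config, defaults=DEFAULTS):
    """Overlay user-supplied (non-None) settings on top of the defaults."""
    merged = dict(defaults)
    merged.update({k: v for k, v in user_config.items() if v is not None})
    return merged


def describe_parameters(user_config, defaults=DEFAULTS):
    """Return one line per parameter: what was supplied vs. what is used."""
    effective = effective_parameters(user_config, defaults)
    lines = []
    for name in sorted(effective):
        supplied = repr(user_config[name]) if name in user_config else "<not set>"
        lines.append(f"{name}: input={supplied} effective={effective[name]!r}")
    return lines


def log_parameters(user_config, defaults=DEFAULTS):
    """Emit the parameter dump once, at the start of a migration."""
    for line in describe_parameters(user_config, defaults):
        logger.info(line)
```

Showing the supplied and effective values side by side makes it visible when a default was silently applied.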

julienrf commented 1 month ago

I reviewed the way we use the configuration parameters and discovered that some of them are ineffective (e.g., setting scanSegments in the target does nothing; it works only if set on the source table).
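
One way to surface such ineffective settings would be a check at startup. The following Python sketch is only an illustration under assumptions: the function names and the `SOURCE_ONLY_KEYS` set are hypothetical, and `scanSegments` is the only source-only setting named in this thread.

```python
# Settings that, per this thread, only take effect on the source table.
# The set is an assumption for illustration; it may be incomplete.
SOURCE_ONLY_KEYS = {"scanSegments"}


def find_ineffective_settings(target_config):
    """Return the source-only keys that were set on the target."""
    return sorted(
        key
        for key, value in target_config.items()
        if key in SOURCE_ONLY_KEYS and value is not None
    )


def warnings_for(target_config):
    """Build one human-readable warning per ineffective target setting."""
    return [
        f"'{key}' is set on the target but only takes effect on the source; "
        "it will be ignored"
        for key in find_ineffective_settings(target_config)
    ]
```

A check like this could log its warnings together with the other startup parameters, so a misplaced setting is noticed before the migration runs.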

Other than that, there are already logs that describe which parameters are used to configure the Hadoop job properties. See e.g. these logs or these logs.

@pdbossman would you consider this issue fixed with the following plan?

Or, is there anything else you need to improve debugging on migrations?

julienrf commented 1 month ago

See PR #199.