scylladb / scylla-migrator

Migrate data to Scylla using Spark, normally from Cassandra
Apache License 2.0

The dependencies of the migrator may clash with the classpath of the Spark cluster it is run on #110

Open julienrf opened 4 months ago

julienrf commented 4 months ago

Currently, the migrator is designed to be run as a Spark job. As a consequence, any of its dependencies (embedded in its fat-jar) may clash with the content of the classpath in the Spark cluster that runs the job.

For instance, the migrator uses a specific version of the AWS SDK, which may clash with another version of the SDK that might be used on the Spark cluster.
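One way to start investigating is to print where a given class is actually loaded from when the job runs on the cluster. The following is only a minimal sketch: `ClasspathProbe` is a hypothetical helper that is not part of the migrator, and the probed class names are assumptions (the first one is a core class of version 1 of the AWS SDK for Java; substitute whichever classes are suspected of clashing).

```scala
// Hypothetical diagnostic helper, not part of the migrator code base.
object ClasspathProbe {

  /** Returns the jar or directory a class was loaded from, if the JVM exposes it.
    * Returns None when the class is missing, fails to link, or has no code source
    * (which is typically the case for classes loaded by the bootstrap class loader).
    */
  def locationOf(className: String): Option[String] =
    try {
      val cls = Class.forName(className)
      for {
        domain   <- Option(cls.getProtectionDomain)
        source   <- Option(domain.getCodeSource)
        location <- Option(source.getLocation)
      } yield location.toString
    } catch {
      case _: ClassNotFoundException | _: LinkageError => None
    }

  def main(args: Array[String]): Unit = {
    // Assumed class names, chosen for illustration only.
    val probes = Seq(
      "com.amazonaws.AmazonWebServiceClient",
      "org.apache.spark.SparkContext"
    )
    probes.foreach { name =>
      val where = locationOf(name).getOrElse("not found, or no code source available")
      println(s"$name -> $where")
    }
  }
}
```

If the reported location of an SDK class is a jar provided by the cluster rather than the migrator's fat-jar, the cluster's version is the one being used, and a clash becomes plausible.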

More investigation is needed to assess whether this is a real problem. If it is, one solution would be to shade the migrator's internal dependencies (relocate them to a different package) within its fat-jar.
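For reference, shading could look roughly like the following `build.sbt` fragment. This is a sketch under assumptions, not a tested change: it assumes the fat-jar is built with the sbt-assembly plugin (version 1.x or later syntax), that the clashing classes live under `com.amazonaws`, and the target prefix `scylla_migrator_shaded` is a made-up name.

```scala
// build.sbt (fragment). Requires the sbt-assembly plugin in project/plugins.sbt.
// Relocates the AWS SDK classes bundled in the fat-jar to a private package,
// and rewrites references to them in every class included in the jar.
assembly / assemblyShadeRules := Seq(
  ShadeRule
    .rename("com.amazonaws.**" -> "scylla_migrator_shaded.com.amazonaws.@1") // hypothetical prefix
    .inAll
)
```

Shading rewrites bytecode references only, so classes that are looked up by name at runtime (through reflection or configuration strings) may still resolve to the unshaded package and would need to be checked separately.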