scylladb / scylla-migrator

Migrate data extract using Spark to Scylla, normally from Cassandra
Apache License 2.0

Documentation, examples in README for how to import from S3 to Scylla #145

Status: Open. wpaven opened this issue 1 month ago

wpaven commented 1 month ago

This capability was added in https://github.com/scylladb/scylla-migrator/issues/136, but what needs to be done to make it work? Can we have some documentation, please? A sample config.yaml, a sample spark-submit command, and any considerations for the S3 bucket would be very useful. Are there any limitations? For example, import only works with JSON exports, correct?

julienrf commented 1 month ago

For now, the documentation is in the file config.yaml.example:

https://github.com/scylladb/scylla-migrator/blob/ab59858bab9814190dd02afdecab92fcb372c421/config.yaml.example#L91-L122

I will publish the docs on the website soon (related: #116).

wpaven commented 1 month ago

Do you have a sample config.yaml that you can share with me until the docs are published, please? I have questions such as: is the bucket name the ARN (Amazon Resource Name) of the bucket? Seeing an example of the format of each of these attributes would be very helpful.

wpaven commented 1 month ago

Thanks @julienrf for pointing out the example config.yaml in the test configurations: https://github.com/scylladb/scylla-migrator/blob/master/tests/src/test/configurations/dynamodb-s3-export-to-alternator-basic.yaml