scylladb / scylla-migrator

Migrate data extract using Spark to Scylla, normally from Cassandra
Apache License 2.0

DynamoDB migration is unable to read credentials. #122

Open pdbossman opened 3 months ago

pdbossman commented 3 months ago

Attempted to migrate from DynamoDB

I ran aws configure, and from the master and the workers I am able to list DynamoDB tables.

Source DynamoDB:

aws dynamodb list-tables
{
    "TableNames": [
        "monitoring",
        "redacted-table-name-here",
        "tfstate-locks"
    ]
}

Target Scylla (I have an /etc/hosts entry mapping the hostname scylla to the proper IP):

aws dynamodb list-tables --endpoint-url "http://scylla:8000"
{
    "TableNames": [
        "redacted-table-name-here"
    ]
}

When I run spark-submit, it hangs looking for security credentials.

Spark Executor Command: "/usr/lib/jvm/java-8-openjdk-amd64/bin/java" "-cp" "/opt/spark/conf/:/opt/spark/jars/*" "-Xmx1024M" "-Dspark.driver.port=34107" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@spark-master:34107" "--executor-id" "0" "--hostname" "172.31.19.213" "--cores" "7" "--app-id" "app-20240329192206-0000" "--worker-url" "spark://Worker@172.31.19.213:42357"
========================================

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
24/03/29 19:22:06 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 10218@ip-172-31-19-213
24/03/29 19:22:06 INFO SignalUtils: Registered signal handler for TERM
24/03/29 19:22:06 INFO SignalUtils: Registered signal handler for HUP
24/03/29 19:22:06 INFO SignalUtils: Registered signal handler for INT
24/03/29 19:22:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/03/29 19:22:06 INFO SecurityManager: Changing view acls to: ubuntu
24/03/29 19:22:06 INFO SecurityManager: Changing modify acls to: ubuntu
24/03/29 19:22:06 INFO SecurityManager: Changing view acls groups to: 
24/03/29 19:22:06 INFO SecurityManager: Changing modify acls groups to: 
24/03/29 19:22:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ubuntu); groups with view permissions: Set(); users  with modify permissions: Set(ubuntu); groups with modify permissions: Set()
24/03/29 19:22:07 INFO TransportClientFactory: Successfully created connection to spark-master/172.31.19.213:34107 after 59 ms (0 ms spent in bootstraps)
24/03/29 19:22:07 INFO SecurityManager: Changing view acls to: ubuntu
24/03/29 19:22:07 INFO SecurityManager: Changing modify acls to: ubuntu
24/03/29 19:22:07 INFO SecurityManager: Changing view acls groups to: 
24/03/29 19:22:07 INFO SecurityManager: Changing modify acls groups to: 
24/03/29 19:22:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ubuntu); groups with view permissions: Set(); users  with modify permissions: Set(ubuntu); groups with modify permissions: Set()
24/03/29 19:22:07 INFO TransportClientFactory: Successfully created connection to spark-master/172.31.19.213:34107 after 1 ms (0 ms spent in bootstraps)
24/03/29 19:22:07 INFO DiskBlockManager: Created local directory at /tmp/spark-0926f7f1-5429-4977-9039-b27ef29e9fc1/executor-d75a5ba6-9ad2-40fd-80c5-4b5e015cd6c5/blockmgr-c42744cc-9d5b-4850-bf94-24ebf5b5fa4e
24/03/29 19:22:07 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
24/03/29 19:22:07 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@spark-master:34107
24/03/29 19:22:07 INFO WorkerWatcher: Connecting to worker spark://Worker@172.31.19.213:42357
24/03/29 19:22:07 INFO TransportClientFactory: Successfully created connection to /172.31.19.213:42357 after 1 ms (0 ms spent in bootstraps)
24/03/29 19:22:07 INFO WorkerWatcher: Successfully connected to spark://Worker@172.31.19.213:42357
24/03/29 19:22:07 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
24/03/29 19:22:07 INFO Executor: Starting executor ID 0 on host 172.31.19.213
24/03/29 19:22:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36957.
24/03/29 19:22:07 INFO NettyBlockTransferService: Server created on 172.31.19.213:36957
24/03/29 19:22:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/03/29 19:22:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(0, 172.31.19.213, 36957, None)
24/03/29 19:22:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(0, 172.31.19.213, 36957, None)
24/03/29 19:22:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(0, 172.31.19.213, 36957, None)
24/03/29 19:22:08 INFO CoarseGrainedExecutorBackend: Got assigned task 0
24/03/29 19:22:08 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
24/03/29 19:22:08 INFO Executor: Fetching spark://spark-master:34107/jars/scylla-migrator-assembly-0.0.1.jar with timestamp 1711740125874
24/03/29 19:22:08 INFO TransportClientFactory: Successfully created connection to spark-master/172.31.19.213:34107 after 1 ms (0 ms spent in bootstraps)
24/03/29 19:22:08 INFO Utils: Fetching spark://spark-master:34107/jars/scylla-migrator-assembly-0.0.1.jar to /tmp/spark-0926f7f1-5429-4977-9039-b27ef29e9fc1/executor-d75a5ba6-9ad2-40fd-80c5-4b5e015cd6c5/spark-f0a6e49c-e29a-4084-8870-2d8bff172345/fetchFileTemp1824647902647891514.tmp
24/03/29 19:22:08 INFO Utils: Copying /tmp/spark-0926f7f1-5429-4977-9039-b27ef29e9fc1/executor-d75a5ba6-9ad2-40fd-80c5-4b5e015cd6c5/spark-f0a6e49c-e29a-4084-8870-2d8bff172345/18937338341711740125874_cache to /opt/spark/work/app-20240329192206-0000/0/./scylla-migrator-assembly-0.0.1.jar
24/03/29 19:22:08 INFO Executor: Adding file:/opt/spark/work/app-20240329192206-0000/0/./scylla-migrator-assembly-0.0.1.jar to class loader
24/03/29 19:22:08 INFO TorrentBroadcast: Started reading broadcast variable 1
24/03/29 19:22:08 INFO TransportClientFactory: Successfully created connection to spark-master/172.31.19.213:39917 after 1 ms (0 ms spent in bootstraps)
24/03/29 19:22:08 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 26.0 KB, free 366.3 MB)
24/03/29 19:22:08 INFO TorrentBroadcast: Reading broadcast variable 1 took 53 ms
24/03/29 19:22:09 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 72.1 KB, free 366.2 MB)
24/03/29 19:22:09 INFO HadoopRDD: Input split: org.apache.hadoop.dynamodb.split.DynamoDBSegmentsSplit@535476a9
24/03/29 19:22:09 INFO TorrentBroadcast: Started reading broadcast variable 0
24/03/29 19:22:09 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.2 KB, free 366.2 MB)
24/03/29 19:22:09 INFO TorrentBroadcast: Reading broadcast variable 0 took 6 ms
24/03/29 19:22:09 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 323.3 KB, free 365.9 MB)
24/03/29 19:22:09 INFO DynamoDBUtil: Using endpoint for DynamoDB: dynamodb.us-east-1.amazonaws.com
24/03/29 19:22:09 INFO deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
24/03/29 19:22:09 INFO JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
24/03/29 19:22:09 INFO ReadIopsCalculator: Table name: redacted-table-name-here
24/03/29 19:22:09 INFO ReadIopsCalculator: Throughput percent: 0.5
24/03/29 19:22:09 WARN DynamoDBFibonacciRetryer: Retry: 1 Exception: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [com.amazonaws.auth.InstanceProfileCredentialsProvider@3b9afbc3: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]
24/03/29 19:22:09 WARN DynamoDBFibonacciRetryer: Retry: 2 Exception: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [com.amazonaws.auth.InstanceProfileCredentialsProvider@3b9afbc3: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]
24/03/29 19:22:10 WARN DynamoDBFibonacciRetryer: Retry: 3 Exception: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [com.amazonaws.auth.InstanceProfileCredentialsProvider@3b9afbc3: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]
[... retries 4 through 16 repeat the same exception with increasing backoff, 19:22:10 through 19:25:24 ...]

@hopugop @tarzanek @erezvelan

pdbossman commented 3 months ago

FYI - my credentials are set in ~/.aws/credentials.

I also tried exporting them as environment variables:

export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_KEY=...

I also tried specifying them in config.yaml. When I specify them in config.yaml, it gives a different error:

Exception in thread "main" com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The security token included in the request is invalid. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: G9FMU2QPUE87VAJ0BEA230KPKRVV4KQNSO5AEMVJF66Q9ASUAAJG)
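For context, the kind of source credentials block I mean in config.yaml looks roughly like this (a sketch; exact field names may differ between migrator versions, and values are redacted):

```yaml
source:
  type: dynamodb
  table: redacted-table-name-here
  # Explicit credentials instead of relying on the provider chain:
  credentials:
    accessKey: AKIA...
    secretKey: ...
  region: us-east-1
  scanSegments: 1
```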

pdbossman commented 3 months ago

This seems to require an instance profile rather than using the config file. I don't have permissions to change the instance profile. I'm not sure how to change the code to make it use the default chain, but I think that would do the trick, since it would then walk through the different common methods of providing credentials.
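For illustration, the lookup order the AWS default provider chain walks through can be sketched like this (a simplified stdlib-only model, not the SDK's actual implementation):

```python
import configparser
import os


def resolve_credentials_source(env, credentials_path):
    """Simplified model of the AWS default provider chain order:
    environment variables first, then the shared credentials file,
    then the EC2 instance profile (metadata service) as a last resort."""
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    if os.path.isfile(credentials_path):
        parser = configparser.ConfigParser()
        parser.read(credentials_path)
        if parser.has_option("default", "aws_access_key_id"):
            return "shared-credentials-file"
    return "instance-profile"


# With no env vars and no credentials file visible to the process, only the
# instance profile is left -- which is where the executor's retries ended up.
print(resolve_credentials_source({}, "/nonexistent/credentials"))  # instance-profile
```

The executor log above shows only InstanceProfileCredentialsProvider being tried, which suggests the code was pinned to the last step of this chain instead of walking all of it.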

julienrf commented 3 months ago

@pdbossman Thank you for the detailed report. I confirm that I reproduced the issue when the credentials are provided via ~/.aws/credentials on the Spark worker node. It seems they are picked up by the master node, but not by the worker node… I will investigate further.

However, I could not reproduce your problem when the credentials are provided in the config.yaml file. In that case, the migration runs fine for me. Would you mind sharing your whole config.yaml (without the credentials section)? Did you also try removing the ~/.aws/credentials file from the Spark master and worker nodes when the credentials were provided by the config.yaml file?

pdbossman commented 3 months ago

Hi Julien, I may need to walk Lubos through this part and have him work with you. We're using Okta and gimme-aws-creds, and it produces the AWS credentials file, which has the following components:

[default]
aws_access_key_id = ...
aws_secret_access_key = ...
aws_session_token = ...
x_security_token_expires = ...

The generated credentials expire. I was providing the access key and secret access key from the generated security credentials, but it's clear to me now that they cannot be used that way: they are tied to the session token.
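That would explain the UnrecognizedClientException: temporary (STS) credentials are only valid as a triple of access key, secret key, and session token. A quick stdlib sketch of reading such a file (the profile layout mirrors what gimme-aws-creds writes; the values here are placeholders):

```python
import configparser

# Example of a gimme-aws-creds-style credentials file (placeholder values).
SAMPLE = """\
[default]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = examplesecret
aws_session_token = exampletoken
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
profile = parser["default"]

# Temporary credentials must be presented as all three parts together.
# Sending only the first two is what produces "The security token
# included in the request is invalid."
has_session_token = "aws_session_token" in profile
print(has_session_token)  # True
```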

So when I normally run (and on the previous version of scylla-migrator), the source credentials are completely commented out. After running gimme-aws-creds, I would run aws configure; the access key and secret key were pre-filled from what gimme-aws-creds created, so I only really had to run it to set the region. Then I didn't need to provide anything for the source in the yaml file at all except type, table name, and scanSegments.

Basically, I need this to work without providing credentials in the yaml file at all. If we need to have a quick meeting Monday, let me know.

pdbossman commented 3 months ago

Actually, to be clear - I think if you just fix the ~/.aws/credentials file usage for the worker, you'll have solved the problem.

julienrf commented 3 months ago

@pdbossman I was able to use the AWS profile credentials with the following change: https://github.com/julienrf/scylla-migrator/tree/aws-credentials Could you please let me know if that fixes your issue?

pdbossman commented 3 months ago

> @pdbossman I was able to use the AWS profile credentials with the following change: https://github.com/julienrf/scylla-migrator/tree/aws-credentials Could you please let me know if that fixes your issue?

It does! Thank you!

tarzanek commented 3 months ago

So we need it configurable, so people can also go with ~/.aws/credentials on the workers. And there is ONE more way: an IAM role attached to the VM running the workers itself (which I'd expect is the default in EMR).

tarzanek commented 3 months ago

FWIW, for our access, @pdbossman, we use an assumed role, so we will need support for something like this: https://stackoverflow.com/questions/44316061/does-spark-allow-to-use-amazon-assumed-role-and-sts-temporary-credentials-for-dy

tarzanek commented 3 months ago

But let's go step by step and fix the current access first. I will merge https://github.com/scylladb/scylla-migrator/pull/123. @julienrf, can you just amend it with a reference to this issue?

tarzanek commented 3 months ago

The next step would be to make it configurable, so it basically goes down a hierarchy until it finds credentials. The options:

- instance metadata token
- credentials in .aws/credentials of the user that runs the executor (and the master)
- assumed role (as per what Patrick is doing)
- the old (likely insecure) way of just an access key and secret
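The hierarchy above is essentially a provider chain. A minimal sketch of the idea (hypothetical names, not the migrator's actual code, which would compose the AWS SDK's provider classes instead):

```python
class CredentialsProvider:
    """One link in a chain: returns credentials, or None if unavailable."""

    def get(self):
        raise NotImplementedError


class StaticProvider(CredentialsProvider):
    """Stand-in for a real provider (metadata, profile file, STS, ...)."""

    def __init__(self, creds):
        self._creds = creds

    def get(self):
        return self._creds


class ProviderChain(CredentialsProvider):
    """Tries each provider in order and returns the first hit, mirroring
    how the SDK's default chain behaves."""

    def __init__(self, providers):
        self._providers = providers

    def get(self):
        for provider in self._providers:
            creds = provider.get()
            if creds is not None:
                return creds
        raise RuntimeError(
            "Unable to load AWS credentials from any provider in the chain")


# Order matching the hierarchy discussed above (all hypothetical stand-ins):
chain = ProviderChain([
    StaticProvider(None),                # instance metadata: nothing there
    StaticProvider(None),                # ~/.aws/credentials: missing on worker
    StaticProvider(("AKIA...", "...")),  # explicit key/secret as last resort
])
print(chain.get())
```

Making the migrator use such a chain (rather than a single fixed provider) is what would let each deployment pick whichever credential source it has available.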

tarzanek commented 3 months ago

com.amazonaws.auth.InstanceProfileCredentialsProvider seems to already be part of a chain, so let's see how we can use it / optimize it / configure it.