Closed aiqc closed 1 month ago
Within the domain configuration, I found where the SageMaker user's EMR Assumable Role and EMR Execution Role attributes can be defined.
However, it is not clear which ARN values I should be using, nor am I able to get the Spark context working in either kernel (Glue PySpark, SparkMagic PySpark).
I added Glue to the list of services in the custom trust policy example from the documentation (https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/getting-started.html#gs-runtime-role), and now the SparkMagic PySpark connection in the notebook works as expected.
The documentation lists only `emr-serverless.amazonaws.com` as a service principal; `glue.amazonaws.com` is the entry I added:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EMRServerlessTrustPolicy",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "emr-serverless.amazonaws.com",
          "glue.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```
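For anyone hitting the same wall, here is a minimal sketch of how I'd generate and apply that trust policy document without hand-editing JSON in the console. The role name in the CLI comment is a placeholder, not my actual role:

```python
import json

# Trust policy from above: allow both EMR Serverless and Glue to
# assume the runtime role. glue.amazonaws.com is the principal I
# had to add for the SparkMagic PySpark kernel to connect.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EMRServerlessTrustPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "emr-serverless.amazonaws.com",
                    "glue.amazonaws.com",
                ]
            },
            "Action": "sts:AssumeRole",
        }
    ],
}

# Write the document to disk so it can be applied with the AWS CLI:
#   aws iam update-assume-role-policy \
#       --role-name MyEmrRuntimeRole \      # <-- placeholder name
#       --policy-document file://trust-policy.json
with open("trust-policy.json", "w") as f:
    json.dump(trust_policy, f, indent=2)
```

`aws iam update-assume-role-policy` replaces the role's entire trust relationship, so make sure the document includes every service principal the role still needs, not just the one you're adding.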
Maybe that fixed it? I don't know. This was supposed to be a fun thing to explore on Friday morning, but now it's Sunday night.
Closing this because I no longer need help, but it is a pain point for sure.
Question
How can I authorize my SageMaker Studio notebook to connect to my EMR Cluster?
Other Details
https://stackoverflow.com/questions/78962340/sagemaker-emr-cluster-select-emr-runtime-role-for-cluster