aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

Executing sagemaker.get_execution_role() locally #300

Closed opringle closed 10 months ago

opringle commented 6 years ago

Minimal repro / logs

To reproduce the problem:

Script:

import sagemaker
import boto3

session = boto3.Session(profile_name='personal')
sagemaker_session = sagemaker.Session(boto_session=session)
role = sagemaker.get_execution_role(sagemaker_session=sagemaker_session)

Credentials:

[personal]
aws_secret_access_key = ******************
aws_access_key_id = *******************
region = us-west-2

Error:

Traceback (most recent call last):
  File "mwe.py", line 8, in <module>
    role = sagemaker.get_execution_role(sagemaker_session=sagemaker_session)
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/sagemaker/session.py", line 936, in get_execution_role
    arn = sagemaker_session.get_caller_identity_arn()
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/sagemaker/session.py", line 766, in get_caller_identity_arn
    role = self.boto_session.client('iam').get_role(RoleName=role_name)['Role']['Arn']
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchEntityException: An error occurred (NoSuchEntity) when calling the GetRole operation: The user with name oliver_pringle cannot be found.

yangaws commented 6 years ago

Hi @opringle ,

The problem is that get_execution_role() is only meant to be used on Amazon SageMaker notebook instances. If you call it locally, it won't correctly parse your credentials (from your stack trace, I think you are using IAM user credentials).

If you want to use SageMaker locally, create an IAM role with sufficient SageMaker permissions, then pass that role's ARN directly in your code.

Feel free to reopen this if you have more questions.

Thanks

leopd commented 5 years ago

This is really a pretty bad experience. get_execution_role() sounds like it's going to just figure out all the IAM/role/confusion/whatever to make SageMaker work. And on a notebook instance it does. But if you run that same code on your laptop it fails, sending customers into IAM/role/confusion limbo.

leopd commented 5 years ago

Without this it's basically impossible to write a simple set of code that works both on a SageMaker notebook instance and anywhere else. Which is a real barrier to people who want to build the SageMaker ecosystem.

laurenyu commented 5 years ago

Understood. I definitely agree that the SDK can do better here. I'll leave this issue open as a feature request, and hopefully we can prioritize this work in the near future. Thanks @leopd!

thomelane commented 5 years ago

Also having issues here, +1 to smoothing it out.

Soypete commented 5 years ago

same

iluoyi commented 5 years ago

A temporary solution is to reuse the IAM role attached to your notebook instance (you selected one when you created the notebook). You can get its ARN from the IAM console.
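
If you prefer not to go through the console, the same ARN can probably be read through the SageMaker API. The sketch below assumes you know the notebook instance's name and that your local credentials are allowed to call DescribeNotebookInstance; the helper name notebook_role_arn and the instance name in the usage comment are made up for illustration.

```python
def notebook_role_arn(sm_client, notebook_name):
    """Return the IAM role ARN attached to a SageMaker notebook instance.

    sm_client is expected to be a boto3 SageMaker client,
    e.g. boto3.client("sagemaker").
    """
    response = sm_client.describe_notebook_instance(
        NotebookInstanceName=notebook_name
    )
    return response["RoleArn"]


# Usage (hypothetical instance name; requires boto3 and AWS credentials):
#   import boto3
#   role = notebook_role_arn(boto3.client("sagemaker"), "my-notebook")
```

The client is passed in rather than created inside the function so that the profile and region stay under the caller's control.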

stevehawley commented 5 years ago

I think local mode should work offline, what need is there to check credentials when running locally?

gilinachum commented 4 years ago

I have written this super hacky function to resolve the SageMaker execution role. It may fail miserably, and you should probably not use it at all. But it may work in simple cases:

import boto3

def resolve_sm_role():
    # IAM is a global service, so the client does not need a region.
    client = boto3.client('iam')
    response_roles = client.list_roles(
        PathPrefix='/',
        # Marker='string',
        MaxItems=999
    )
    # Return the first role that follows the default execution-role naming.
    for role in response_roles['Roles']:
        if role['RoleName'].startswith('AmazonSageMaker-ExecutionRole-'):
            print('Resolved SageMaker IAM Role to: ' + str(role))
            return role['Arn']
    raise Exception('Could not resolve what should be the SageMaker role to be used')

ricoms commented 4 years ago

sagemaker.get_execution_role() could basically get the environment variable AWS_ROLE_SESSION_NAME as it's documented for credentials setup, and that would fit local processing too. But, sorry, all AWS IAM needs a refactoring

NukaCody commented 4 years ago

Putting iluoyi's solution in code

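import boto3
import sagemaker

try:
    # Works on a SageMaker notebook instance.
    role = sagemaker.get_execution_role()
except ValueError:
    # Running locally: look up the execution role by name (use your own role name).
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='AmazonSageMaker-ExecutionRole-20191205T100050')['Role']['Arn']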

A SageMaker execution role already exists if you have ever run a job before. If not:

  1. Log onto the console -> IAM -> Roles -> Create Role
  2. Create a service role with sagemaker.amazonaws.com as the trusted service
  3. Give the role AmazonSageMakerFullAccess
  4. Give the role AmazonS3FullAccess (<-- scope down if reasonable)

Then use the name in RoleName= like above

A potential long-term solution would be a function that checks for an existing execution service role and, if it does not exist, creates one... but creating a service role with managed policies through boto3 IAM requires... patience.
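
A rough sketch of that check-then-create idea follows. It is not part of the SDK: the function name ensure_sagemaker_role is hypothetical, it assumes the caller's credentials are allowed to call iam:GetRole, iam:CreateRole, and iam:AttachRolePolicy, and it attaches only AmazonSageMakerFullAccess (S3 access would still need to be added and scoped separately).

```python
import json

# Trust policy that lets the SageMaker service assume the role.
SAGEMAKER_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def ensure_sagemaker_role(iam, role_name):
    """Return the ARN of role_name, creating the role if it does not exist.

    iam is expected to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    try:
        return iam.get_role(RoleName=role_name)["Role"]["Arn"]
    except iam.exceptions.NoSuchEntityException:
        pass  # Role is missing, so fall through and create it.

    created = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(SAGEMAKER_TRUST_POLICY),
    )
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    )
    return created["Role"]["Arn"]
```

A newly created role can take a few seconds to become usable, so a job started immediately after creation may still fail and need a retry.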

larroy commented 4 years ago

Any plans to fix this? This is very annoying if you want to execute notebooks locally. get_execution_role should create a default role with SageMaker permissions when called outside of a notebook.

rodrigoheck commented 3 years ago

Nothing yet?

rapuckett commented 3 years ago

Almost three years later and this is still an issue?

TanjaNY commented 3 years ago

Got this today: "The current AWS identity is not a role: arn:aws:iam::XXXXXXXXXX:user/xxxxxxxx, therefore it cannot be used as a SageMaker execution role."
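
The message distinguishes a user identity from a role identity, and that distinction is visible in the ARN itself. The following is a minimal sketch of that check, not the SDK's actual implementation; the function names identity_kind and is_role_identity are made up for illustration.

```python
def identity_kind(arn):
    """Return the resource type of an IAM or STS ARN, e.g. 'user' or 'role'."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("not an ARN: " + arn)
    # The resource looks like 'user/name', 'role/path/name', or
    # 'assumed-role/role-name/session-name'.
    return parts[5].split("/", 1)[0]


def is_role_identity(arn):
    """True if the ARN names a role or an assumed-role session."""
    return identity_kind(arn) in ("role", "assumed-role")


print(is_role_identity("arn:aws:iam::123456789012:user/alice"))
print(is_role_identity("arn:aws:iam::123456789012:role/service-role/Example"))
```

With local IAM user credentials the caller identity has the user/ form, which is why the call is rejected.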

cccntu commented 3 years ago

The above solution (https://github.com/aws/sagemaker-python-sdk/issues/300#issuecomment-577957428) is in docs now: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html

tchaton commented 2 years ago

No update there? This issue is 4 years old ...

ghost commented 2 years ago

Just stumbled across this issue. Will this issue ever be solved?

ioanfr commented 1 year ago

Inside SageMaker we can have multiple notebook instances and each notebook instance can have a different IAM role. When running your code locally get_execution_role will not work since there might be several roles dedicated to different SageMaker notebook instances. Therefore, you have to choose which is the right role to use.

To make your code work both locally and on a notebook instance, you could define a variable holding the specific IAM role ARN and wrap the call in a try block like the one below.

import sagemaker

local_variable_for_sm_role = "arn:aws:iam::XXXX:role/service-role/XXXXX"
try:
    role = sagemaker.get_execution_role()
except ValueError:
    role = local_variable_for_sm_role
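
The same pattern can be wrapped in a small helper so the ARN does not have to be hard-coded in every script. This is only a sketch: the environment variable name SAGEMAKER_ROLE_ARN and the function name resolve_role are assumptions, not anything the SDK defines, and older SDK versions raised a botocore error instead of ValueError (as in the traceback at the top of this thread), which this helper would not catch.

```python
import os


def resolve_role(get_role, env_var="SAGEMAKER_ROLE_ARN"):
    """Try get_role() first, then fall back to an ARN stored in env_var.

    get_role is any zero-argument callable, typically
    sagemaker.get_execution_role.
    """
    try:
        return get_role()
    except ValueError:
        role = os.environ.get(env_var)
        if not role:
            # No fallback configured: surface the original error.
            raise
        return role


# Usage (requires the sagemaker package):
#   import sagemaker
#   role = resolve_role(sagemaker.get_execution_role)
```

Passing the getter in as an argument keeps the helper free of a hard dependency on the SDK and makes it easy to exercise without AWS access.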

celsofranssa commented 1 year ago

It seems that sagemaker-python-sdk team does not care about the community issues.

variable-ad commented 7 months ago

I got the same error. Tried everything, is it still an issue?

TanjaNY commented 7 months ago

> I got the same error. Tried everything, is it still an issue?

I am getting around it like this: I created a SageMakerAllAccess role and set role to the ARN of that role. Works for me.

role = 'arn:aws:iam::ACCTNMRXXXX:role/SageMakerAllAccess'

liambolling commented 3 weeks ago

How is this not fixed and just closed?