aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

sagemaker: can not launch studio app for a SSO user that is created with CDK #23627

Open hossein-jazayeri opened 1 year ago

hossein-jazayeri commented 1 year ago

Describe the bug

I'd like to create a SageMaker user profile and app in the same stack as the SageMaker domain, using the SSO users in the account. The stack deploys without any error, but attempts to open the Studio app from the console yield the following error:

Access Denied. Please check if user is assigned to Studio Domain [...] and SSO Application [Amazon SageMaker Studio (...)] is Active.

Expected Behavior

The Studio app should create the Jupyter server without any issues.

Current Behavior

The user and app are created with the stack successfully, but accessing the Jupyter server via the user's profile in the console fails with the above-mentioned error.

Reproduction Steps

Here's the stack:

from aws_cdk import Stack, aws_sagemaker
from constructs import Construct


class SagemakerDomainUsersStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, config: dict, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        # Resources are created in this stack's own scope (self), not the parent scope.
        domain = aws_sagemaker.CfnDomain(
            scope=self,
            id="sagemaker-domain",
            auth_mode="SSO",
            default_user_settings=aws_sagemaker.CfnDomain.UserSettingsProperty(execution_role=config["execution_role"]),
            domain_name=config["domain_name"],
            subnet_ids=config["subnet_ids"],
            vpc_id=config["vpc_id"],
        )

        for user in config["users"]:
            # One user profile per SSO user; construct IDs are elided in the original report.
            user_profile = aws_sagemaker.CfnUserProfile(
                scope=self,
                id=...,
                domain_id=domain.attr_domain_id,
                user_profile_name=user["aws_username"].split("@")[0].replace(".", "-"),
                single_sign_on_user_identifier="UserName",
                single_sign_on_user_value=user["aws_username"],
            )

            # Default JupyterServer app for the profile created above.
            aws_sagemaker.CfnApp(
                scope=self,
                id=...,
                app_name="default",
                app_type="JupyterServer",
                domain_id=domain.attr_domain_id,
                user_profile_name=user_profile.user_profile_name,
            )

Configurations look like this:

vpc_id: ...
subnet_ids:
  - ...
  - ...
execution_role: ...
domain_name: ...
users:
  - aws_username: first.user@postnl.nl
  - aws_username: second.user@postnl.nl
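
For readers following the reproduction, here is a small sketch of how the stack turns each `aws_username` from this configuration into a `user_profile_name`. The helper name `profile_name_from_username` is hypothetical (the original stack inlines the expression); the full username is still what gets passed as `single_sign_on_user_value`.

```python
def profile_name_from_username(aws_username: str) -> str:
    """Mirror the expression in the stack: drop the mail domain, swap dots for hyphens."""
    local_part = aws_username.split("@")[0]
    return local_part.replace(".", "-")


# Same shape as the `users` list in the configuration above.
users = [
    {"aws_username": "first.user@postnl.nl"},
    {"aws_username": "second.user@postnl.nl"},
]

profile_names = [profile_name_from_username(u["aws_username"]) for u in users]
print(profile_names)  # ['first-user', 'second-user']
```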

Possible Solution

No response

Additional Information/Context

CDK CLI Version

2.59.0

Framework Version

No response

Node.js Version

v18.0.0

OS

Linux

Language

Python

Language Version

Python (3.10.8)

Other information

No response

davyto commented 1 year ago

I get the same error using the python cdk

pahud commented 1 year ago

Hi

After you run cdk deploy, you need to assign users and groups to the new domain in the console. You will probably see a ResourceInUse error, but you will then be able to launch the app.


I will reach out to the relevant team internally for this, but this is how it works now.

My CDK code:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import {
  aws_sagemaker as sagemaker,
  aws_ec2 as ec2,
  aws_iam as iam,
} from 'aws-cdk-lib';

export class SagemakerTsStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { isDefault: true });
    const executionRole = new iam.Role(this, 'ExecutionRole', {
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSageMakerFullAccess'),
        iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSageMakerCanvasFullAccess'),
      ],
      assumedBy: new iam.ServicePrincipal('sagemaker.amazonaws.com'),
    });

    const domain = new sagemaker.CfnDomain(this, 'Domain', {
      authMode: 'SSO',
      subnetIds: vpc.selectSubnets({
        subnetType: ec2.SubnetType.PUBLIC,
      }).subnetIds,
      defaultUserSettings: {
        executionRole: executionRole.roleArn,
      },
      vpcId: vpc.vpcId,
      domainName: 'mydomain',
    });

    const userProfile = new sagemaker.CfnUserProfile(this, 'UserProfile', {
      domainId: domain.ref,
      userProfileName: 'pahudProfile',
      singleSignOnUserIdentifier: 'UserName',
      singleSignOnUserValue: 'pahud',
    });

    const jupyter = new sagemaker.CfnApp(this, 'Jupiter', {
      appName: 'default',
      appType: 'JupyterServer',
      domainId: domain.ref,
      userProfileName: userProfile.userProfileName,
    });

    jupyter.node.addDependency(userProfile);
  }
}

Let me know if it works for you.

hossein-jazayeri commented 1 year ago

The whole idea is to not go to the console! With CDK we should be able to create a new domain and assign users & apps without the need to go to the console to adjust things! So, no, this is not an acceptable solution!

pahud commented 1 year ago

@hossein-jazayeri

Agreed. This is definitely not an acceptable user experience. We have reported this internally to the relevant team, and I will update here when I have any news. The problem lies in SageMaker's SSO integration, and there is nothing CDK can do about it at the moment.

marcellovictorino commented 1 year ago

@pahud I can confirm this issue is still happening. Any updates on this? I appreciate it falls under the responsibility of a different team, but just wondering when this will get solved.

mkielar commented 1 year ago

@pahud, we're seeing the same issue when using Terraform, AWS CLI and boto3. We also looked things up in CloudTrail, and we can see that, when assigning users via the AWS Web Console, there are additional events in CloudTrail named AssociateProfile, which look more or less like this:

    ...
    "eventName": "AssociateProfile",
    "awsRegion": "eu-west-1",
    "sourceIPAddress": "188.114.87.11",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36",
    "requestParameters": {
        "accessorId": "9367087262-04261254-3711-4406-bf1c-bf9927e68ce2",
        "accessorType": "USER",
        "accessorDisplay": {
            "userName": "HIDDEN_DUE_TO_SECURITY_REASONS"
        },
        "directoryId": "d-9367087262",
        "directoryType": "UserPool",
        "instanceId": "ins-55299dee38867b69",
        "profileId": "p-046276c336a0b9c0"
    },

We can also see ListProfileAssociations events when navigating to Identity Center WebConsole and clicking through the "Application Assignments" section.
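
To check for the same events in another account, a sketch along these lines could be used. It assumes boto3's CloudTrail `lookup_events` call, which only searches recent management events in one region; the function name `find_events` is hypothetical, and the client is passed in so the paging logic stays independent of credentials.

```python
from typing import Any, Iterator


def find_events(cloudtrail: Any, event_name: str) -> Iterator[dict]:
    """Yield CloudTrail events with the given name, following NextToken pagination."""
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ]
    }
    while True:
        page = cloudtrail.lookup_events(**kwargs)
        yield from page.get("Events", [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   client = boto3.client("cloudtrail", region_name="eu-west-1")
#   for event in find_events(client, "AssociateProfile"):
#       print(event["EventTime"], event.get("Username"))
```

Swapping the event name for `ListProfileAssociations` would search for the second event mentioned above.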

However, both AssociateProfile and ListProfileAssociations are supposed to be deprecated according to https://docs.aws.amazon.com/singlesignon/latest/userguide/security-authorization.html, and were replaced by CreateAccountAssignment and ListAccountAssignments API Calls respectively.

It looks like SageMaker / SSO still internally uses some old API endpoints which are no longer available to external clients (including your own AWS CLI and boto3). And, what's even worse, the new CreateAccountAssignment, ListAccountAssignments and ListInstances API calls don't support "Application assignments" at all!

What's even worse is that:

  1. CreateAccountAssignment requires specifying a PermissionSetARN, only supports ACCOUNT type targets, and must be executed on the AWS Account where SSO is configured (that's the Root Account in our case).
  2. Whereas it seems that AssociateProfile is executed on the DEV Account, where we deploy SageMaker Studio, and somehow it is able to provision SSO Applications and SSO Application Assignments on the Root Account, even though the user doing this via the WebConsole doesn't have any permission to mess around on the Root Account! It looks like the WebConsole is doing some internal, cross-account magic that allows the user to create/modify/delete resources on the Root Account without explicit IAM permissions!
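
To make point 1 concrete, here is a sketch of the request that boto3's `sso-admin` `create_account_assignment` accepts. The helper `account_assignment_request` and all ARN and ID values are hypothetical placeholders. Note that every field describes an account and a permission set; there is no field that could name an SSO application.

```python
def account_assignment_request(
    instance_arn: str,
    account_id: str,
    permission_set_arn: str,
    principal_id: str,
    principal_type: str = "USER",
) -> dict:
    """Build keyword arguments for sso-admin create_account_assignment."""
    return {
        "InstanceArn": instance_arn,
        "TargetId": account_id,           # a 12-digit AWS account ID
        "TargetType": "AWS_ACCOUNT",      # the only target type the API accepts
        "PermissionSetArn": permission_set_arn,
        "PrincipalType": principal_type,  # "USER" or "GROUP"
        "PrincipalId": principal_id,
    }


# Placeholder values, for illustration only.
request = account_assignment_request(
    instance_arn="arn:aws:sso:::instance/ssoins-EXAMPLE",
    account_id="111122223333",
    permission_set_arn="arn:aws:sso:::permissionSet/ssoins-EXAMPLE/ps-EXAMPLE",
    principal_id="EXAMPLE-USER-ID",
)

# Usage (must run in the account that owns the Identity Center instance):
#   import boto3
#   boto3.client("sso-admin").create_account_assignment(**request)
```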

l3ku commented 11 months ago

Can confirm that this is still happening. We would like to create a SageMaker domain and assign SSO users to the domain via IaC, but can not get it working without tinkering around in the console. Any updates from the relevant team?

jmeisele commented 10 months ago

@mkielar I know this open issue is an aws-cdk issue but what would be the correct way via code to add this missing AssociateProfile API call in your opinion?

jmeisele commented 10 months ago

@l3ku this looks like a possible lead for Terraform folks? https://github.com/hashicorp/terraform-provider-aws/issues/28958

mkielar commented 10 months ago

@jmeisele, I don't think there's anything to do for Terraform or CDK to call that API. The API is deprecated, you cannot find any implementation of it in boto3 or any other AWS SDK, so the only way that comes to mind would be to look up some very old version of some SDK (before the AssociateProfile call got deprecated) and see how it worked. If that's successful, I'd try to implement a call to that API (if it's even available publicly, still, which may not be the case) and just wrap that with (manually crafted) AWS Signature V4 and see if it works.

There's a lot of ifs here, and even more places where things may fail because - officially, at least - the endpoint doesn't exist.

I think we just have to wait for AWS / SSO Team to implement missing pieces in the new CreateAccountAssignment API.

jmeisele commented 10 months ago

@mkielar received word from my AWS colleague, this is on the roadmap late 2023, Q1 2024

dustindortch commented 10 months ago

@mkielar received word from my AWS colleague, this is on the roadmap late 2023, Q1 2024

Can we get a link to this being noted in a roadmap so that it can be tracked? This seems like such a major miss.

ichandra-forrester commented 9 months ago

I am facing a similar issue. We are using SSO to manage users for SageMaker. Recently I deleted one of the users from SageMaker and tried to add them again, and got the error: User assignment already exist for given users: FIRST LAST. What is the solution? @pahud, any help would be highly appreciated.

pahud commented 7 months ago

Unfortunately we don't have any workaround from CDK's perspective.

Are we still having the issue now?

IshwarChandra commented 7 months ago

I am facing a similar issue. We are using SSO to manage users for SageMaker. Recently I deleted one of the users from SageMaker and tried to add them again, and got the error: User assignment already exist for given users: FIRST LAST. What is the solution? @pahud, any help would be highly appreciated.

I was able to fix this. Deleting a user from SageMaker doesn't delete the user from the apps under Identity Center; I had to delete that assignment separately. After that I was able to add the user again.
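
The cleanup described here was done by hand. A scripted version might look like the sketch below, assuming the Identity Center `list_application_assignments` and `delete_application_assignment` calls in boto3's `sso-admin` client work against the SageMaker Studio application, which this thread does not confirm. The function name `remove_stale_assignment` is hypothetical.

```python
from typing import Any


def remove_stale_assignment(
    sso_admin: Any,
    application_arn: str,
    principal_id: str,
    principal_type: str = "USER",
) -> bool:
    """Delete the principal's assignment to the application if one exists.

    Returns True when an assignment was found and deleted, False otherwise.
    """
    token = None
    while True:
        kwargs = {"ApplicationArn": application_arn}
        if token:
            kwargs["NextToken"] = token
        page = sso_admin.list_application_assignments(**kwargs)
        for assignment in page.get("ApplicationAssignments", []):
            if (
                assignment["PrincipalId"] == principal_id
                and assignment["PrincipalType"] == principal_type
            ):
                sso_admin.delete_application_assignment(
                    ApplicationArn=application_arn,
                    PrincipalId=principal_id,
                    PrincipalType=principal_type,
                )
                return True
        token = page.get("NextToken")
        if not token:
            return False


# Usage (assumes boto3 and credentials for the Identity Center account):
#   import boto3
#   remove_stale_assignment(boto3.client("sso-admin"), application_arn, user_id)
```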

jmeisele commented 7 months ago

@mkielar @hossein-jazayeri @pahud keep an eye out for the next terraform-provider-aws release. One of the output attributes is going to be the single sign-on ARN for the SageMaker domain. This can be combined with the new resource aws_ssoadmin_application_assignment to assign the application to groups.

https://github.com/hashicorp/terraform-provider-aws/issues/34673 https://github.com/hashicorp/terraform-provider-aws/pull/34741
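
For readers not using Terraform, a rough boto3 analogue of that combination is sketched below. It is not taken from the thread. It assumes an SDK version recent enough that `describe_domain` returns `SingleSignOnApplicationArn` and that `sso-admin` offers `create_application_assignment`. The function name `assign_group_to_domain` and its parameters are hypothetical, and the Identity Center calls probably need to run in the account that owns the Identity Center instance.

```python
from typing import Any


def assign_group_to_domain(
    sagemaker: Any,
    identitystore: Any,
    sso_admin: Any,
    domain_id: str,
    identity_store_id: str,
    group_display_name: str,
) -> str:
    """Assign an Identity Center group to the SSO application behind a domain.

    Returns the application ARN the group was assigned to.
    """
    # 1. Read the SSO application ARN exposed by the SageMaker domain.
    domain = sagemaker.describe_domain(DomainId=domain_id)
    application_arn = domain["SingleSignOnApplicationArn"]

    # 2. Resolve the group's display name to its Identity Store ID.
    group = identitystore.get_group_id(
        IdentityStoreId=identity_store_id,
        AlternateIdentifier={
            "UniqueAttribute": {
                "AttributePath": "displayName",
                "AttributeValue": group_display_name,
            }
        },
    )

    # 3. Create the application assignment for that group.
    sso_admin.create_application_assignment(
        ApplicationArn=application_arn,
        PrincipalId=group["GroupId"],
        PrincipalType="GROUP",
    )
    return application_arn


# Usage (assumes boto3 and suitable credentials for each client):
#   import boto3
#   assign_group_to_domain(
#       boto3.client("sagemaker"), boto3.client("identitystore"),
#       boto3.client("sso-admin"), domain_id, identity_store_id, "data-scientists")
```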