aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

Passed in session should be used for feature group ingest #3332

Open sampoorna opened 2 years ago

sampoorna commented 2 years ago

Describe the bug

My default credentials do not have write access to S3 for a feature, so I need to assume a role using boto/STS. I create all the relevant clients, a SageMaker session, and a feature group that uses that SageMaker session. Then I call .ingest() on the feature group. I have to key in an MFA code when I first assume the role. I would not expect to have to key it in again during ingestion, but my script keeps prompting me for it over and over.

I have traced this down to the following call in _ingest_single_batch():

    sagemaker_featurestore_runtime_client = boto3.Session(profile_name=profile_name).client(
        service_name="sagemaker-featurestore-runtime", config=client_config
    )

It does not use the session that was passed in. It creates a new session instead, which prompts for an MFA code each time.
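The behavior being asked for can be sketched as follows. This is a minimal illustration and not the SDK's actual code: `resolve_runtime_client` and its `session_factory` parameter are hypothetical names introduced here, and the only real library calls are `boto3.Session(profile_name=...)` and `.client(service_name=...)`. The point is that a client supplied by the caller is reused as-is, and a new session (with its MFA prompt) is created only as a fallback.

```python
from typing import Any, Callable, Optional


def resolve_runtime_client(
    supplied_client: Optional[Any],
    profile_name: Optional[str],
    session_factory: Optional[Callable[..., Any]] = None,
) -> Any:
    """Pick the client used for Feature Store writes (hypothetical helper)."""
    if supplied_client is not None:
        # Reuse the caller's client: no new session, so no new MFA prompt.
        return supplied_client
    if session_factory is None:
        # Fallback only: imported lazily so the sketch loads without boto3.
        import boto3

        session_factory = boto3.Session
    # This mirrors the call quoted above; it is the step that re-prompts for MFA.
    return session_factory(profile_name=profile_name).client(
        service_name="sagemaker-featurestore-runtime"
    )
```

With a helper like this, each ingestion worker would receive the client that was passed to the SageMaker session, and the `profile_name` path would only be taken when no client was supplied.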

To reproduce

  1. Ensure "default" creds profile does not have access to write to FeatureGroup
  2. Ensure "<profile_name>" creds profile does have access to write to FeatureGroup. Set the policy to require multi-factor authentication.
  3. Create a boto session using the second profile, and initialise all the relevant clients:
    import boto3
    from sagemaker.session import Session
    from sagemaker.feature_store.feature_group import FeatureGroup

    # Session for the MFA-protected profile; every client below is created from it.
    sandbox_session = boto3.session.Session(profile_name='<profile_name>')
    sagemaker_client = sandbox_session.client('sagemaker')
    sm_runtime_client = sandbox_session.client('sagemaker-runtime')
    sm_featurestore_client = sandbox_session.client('sagemaker-featurestore-runtime')
    sagemaker_session = Session(
        sagemaker_client=sagemaker_client,
        boto_session=sandbox_session,
        sagemaker_featurestore_runtime_client=sm_featurestore_client,
        sagemaker_runtime_client=sm_runtime_client,
    )
    feature_group = FeatureGroup(
        name=feature_group_name, sagemaker_session=sagemaker_session
    )
  4. Attempt to ingest a dataframe into the FeatureGroup with .ingest(), passing profile_name (this should not even be required if the passed-in session were being used, but the intention is to demonstrate that passing it is not a viable workaround):
    feature_group.ingest(
        data_frame=cleaned_and_transformed_df, max_workers=3, wait=True, profile_name='<profile_name>'
    )
  5. The user is prompted for an MFA code once when the first session is created, and then multiple times in quick succession (in parallel) during ingestion.

Expected behavior

I expect the initial session to be used for ingestion. Otherwise this API method is unusable when MFA is enabled on the assumed profile.
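Until .ingest() honors the passed-in session, one possible workaround is to write records directly through the Feature Store runtime client that was already created from the authenticated session. The sketch below is only an illustration under stated assumptions: `to_record` and `ingest_rows` are hypothetical helpers, rows are plain dicts with missing values represented as `None`, and everything is sent sequentially with no retries or batching. The one real API call is the runtime client's `put_record(FeatureGroupName=..., Record=[...])`.

```python
from typing import Any, Dict, Iterable, List


def to_record(row: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert one row into the list-of-features shape that PutRecord takes."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in row.items()
        if value is not None  # assumption: missing values are None and are skipped
    ]


def ingest_rows(
    client: Any, feature_group_name: str, rows: Iterable[Dict[str, Any]]
) -> int:
    """Send each row with the given client; return how many rows were sent."""
    sent = 0
    for row in rows:
        # `client` is the already-authenticated runtime client, so no new
        # session is created and no further MFA prompt should appear.
        client.put_record(
            FeatureGroupName=feature_group_name, Record=to_record(row)
        )
        sent += 1
    return sent
```

With the objects from the reproduction steps, this could be called as `ingest_rows(sm_featurestore_client, feature_group_name, cleaned_and_transformed_df.to_dict("records"))`. Note that pandas represents missing values as NaN rather than `None`, so a real dataframe would need those converted first.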


doronbt commented 1 year ago

@sampoorna were you able to solve this issue?

sampoorna commented 1 year ago

@doronbt Nope. My workaround was to switch to a user account that did not require MFA.

lorenzwalthert commented 2 months ago

I have the same problem without the MFA part: the passed-in SageMaker session is simply not used. I want to pass a SageMaker session with an assumed role, since the currently active / default boto3 session does not have privileges to write to the feature store. Can you please fix this, i.e. ensure that the boto3 client from sagemaker_session is used if supplied? We have premium support.
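For context, a session with an assumed role is typically built from the temporary credentials that STS returns. The sketch below shows that mapping; `session_kwargs_from_assume_role` is a hypothetical helper name, while the `Credentials` keys and the `boto3.Session` keyword arguments are the standard ones. The role ARN and session name in the usage comment are placeholders.

```python
from typing import Any, Dict


def session_kwargs_from_assume_role(response: Dict[str, Any]) -> Dict[str, str]:
    """Map an STS AssumeRole response to boto3.Session keyword arguments."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


# Usage (requires boto3 and valid AWS credentials; values are placeholders):
#   import boto3
#   response = boto3.client("sts").assume_role(
#       RoleArn="<role_arn>", RoleSessionName="feature-store-ingest"
#   )
#   assumed_session = boto3.Session(**session_kwargs_from_assume_role(response))
```

A session built this way has no named profile behind it, so there is no `profile_name` that .ingest() could use to recreate it. That is why the clients derived from the supplied session need to be the ones used for ingestion.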