aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0
2.1k stars 1.14k forks source link

TF Framework support for encrypting code upload #646

Closed gnbk-aws closed 5 years ago

gnbk-aws commented 5 years ago

System Information

Describe the problem

A large data science org has a requirement for encrypted S3 buckets. BlazingText works effectively and pushes data to encrypted S3 buckets. With the same parameters as BlazingText (role, session, subnets, security groups, and output_path) plus a few others for the Framework specifically (code_location, output_kms_key), I am getting a putObject error (access denied).

Minimal repro / logs

The problem is specifically in the process _prepare_for_training -> _stage_user_code_in_s3 -> tar_and_upload_dir -> object_upload_file. It looks like the Boto3 command here is not using the KMS key, even when one is provided to the framework. Here's the specific line: https://github.com/aws/sagemaker-python-sdk/blob/ddfc301bcd886ad8dbeaae41ec1164f4c1b60cad/src/sagemaker/fw_utils.py#L180

The upload_file command could have extra ServerSideEncryption and SSEMKSKeyId args for the S3 Transfer Manager.

estimator = TensorFlow(entry_point = 'model_file.py', role = role, output_path = 's3://encryptedbucket/key', code_location = 's3://encryptedbucket/code_key', hyperparameters = hyperparameter_dict, training_steps = training_steps, evaluation_steps = evaluation_steps, train_instance_count = 1, train_instance_type = 'ml.p3.2xlarge', sagemaker_session = sagemaker_session, subnets = ['subnet'], security_group_ids = ['securitygroup'], output_kms_ket = 'alias/aws/s3', train_volume_kms_key = 'alias/aws/s3')

estimator.fit(inputs = 's3://encryptedbucket/training_key')

Please let me know if you need any further information--thank you!

ChoiByungWook commented 5 years ago

Hello @gnbk-aws,

Thank you for the suggestion. I think this change would require us to modify at the very least the Estimator class to take in an additional parameter for the s3_kms_key: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/estimator.py#L51

I'll relay this to the team as a feature request, however I am unable to provide any ETA.

gnbk-aws commented 5 years ago

Hi @ChoiByungWook,

Thanks for the prompt response!

I noticed that the estimator has a "outout_kms_key" parameter used for encrypting training results. https://github.com/aws/sagemaker-python-sdk/blob/bf6805f32927b4e2b4250dddd497a317b91c4d52/src/sagemaker/estimator.py#L80 I believe this is how estimators provide encrypted training outputs to S3, and it could use the same for staging the user code. From _stage_user_code_in_s3 if you pass the existing KMS key param to tar_and_upload_dir, then we might have a quick fix.

Edit: I forked the repo and committed these edits to be a bit more explicit. It's a draft, of course, and I haven't tested it yet but the commit is here: https://github.com/gnbk-aws/sagemaker-python-sdk/commit/74d24515ebc6bca6df5355039835fdd41e740bda

mvsusp commented 5 years ago

Hi @gnbk-aws ,

Thanks for your code change. Unfortunately, for that change to work, it would be necessary to send the kms to the training container as well. Another possible issue is that the output bucket and the code bucket are different with potentially different kms keys.

We have your feature request in the roadmap.

Thanks for using SageMaker!

gnbk-aws commented 5 years ago

Hi @mvsusp ,

That makes sense. Since the buckets are distinct, you'd need a new framework param for the kms key of the upload_code bucket. And I see how the training containers need that key in order to pull the model.py file from S3, so you'd need to send it to the training containers with the other framework params like here: https://github.com/aws/sagemaker-python-sdk/blob/8b33a305110997e18b1f2f2a37b2d3cdd98b2f91/src/sagemaker/estimator.py#L852

I'm not sure what would need to be modified within the training container from there. Perhaps I'll look around for that later.

Thanks for following-through here :)

mvsusp commented 5 years ago

Hi @gnbk-aws,

The code changes to support server side encryption is already pushed to the master branch. We are expected to release a new version of the SDK today or tomorrow.

gnbk-aws commented 5 years ago

This is great news--thanks @mvsusp !

mvsusp commented 5 years ago

Hi @gnbk-aws,

We just released sagemaker 1.18.5 including Server Side Encryption support.

Thanks for using SageMaker.