aws / aws-step-functions-data-science-sdk-python

Step Functions Data Science SDK for building machine learning (ML) workflows and pipelines on AWS
Apache License 2.0
285 stars 87 forks

Adding tags to a SageMaker estimator in the training step does not seem to be supported #200

Open evaie opened 9 months ago

evaie commented 9 months ago

This is an extract from the notebook "machine_learning_workflow_abalone.ipynb". I added tags to the following estimator:

mes_tags = [{'key': 'cart', 'value': 'dataengineering'}]

xgb = sagemaker.estimator.Estimator(
    image_uris.retrieve("xgboost", region, "1.2-1"),
    sagemaker_execution_role,
    train_instance_count=1,
    train_instance_type="ml.m4.4xlarge",
    train_volume_size=5,
    output_path=bucket_path + "/" + prefix + "/single-xgboost",
    base_job_name=base_job_name,
    tags=mes_tags,
    sagemaker_session=session,
)

No error when creating the sagemaker.estimator object.

The workflow creation fails. When running this command later in the notebook:

workflow.create()

I got the exception:

"InvalidDefinition: An error occurred (InvalidDefinition) when calling the CreateStateMachine operation: Invalid State Machine Definition: 'SCHEMA_VALIDATION_FAILED: The field "key" is not supported by Step Functions at /States/Train Step/Parameters"

This is clearly related to the tags I added earlier. Adding tags to a SageMaker estimator used in the training step does not seem to be supported by the current version of the SDK.
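One thing that may be worth checking: the error names the lowercase field "key", while the SageMaker API defines a tag as an object with the capitalised fields `Key` and `Value`. The sketch below normalises the casing before the tags are passed to the estimator. `to_sagemaker_tags` is a hypothetical helper, not part of either SDK, and whether `workflow.create()` then succeeds is an assumption that has not been verified here.

```python
from typing import Dict, List


def to_sagemaker_tags(tags: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return a copy of `tags` with field names capitalised as 'Key' and 'Value'.

    Hypothetical helper for illustration; not part of the SageMaker or
    Step Functions Data Science SDKs.
    """
    normalized = []
    for tag in tags:
        # Index the fields case-insensitively so 'key', 'KEY' and 'Key' all work.
        lowered = {name.lower(): value for name, value in tag.items()}
        if set(lowered) != {"key", "value"}:
            raise ValueError(f"expected exactly a key and a value, got {tag!r}")
        normalized.append({"Key": lowered["key"], "Value": lowered["value"]})
    return normalized


# Same tag as in the report, rewritten with the casing the SageMaker API uses.
mes_tags = to_sagemaker_tags([{"key": "cart", "value": "dataengineering"}])
print(mes_tags)
```

The resulting `mes_tags` list could then be passed as `tags=mes_tags` to the estimator in place of the original one.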

To reproduce: comment out "tags=mes_tags" and re-run the notebook; the state machine is then created without any errors.

Logs: only the stack trace in the notebook.