aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

feat: Add experiment_config and tags arguments to TrainingStep and ProcessingStep classes #2017

Open brightsparc opened 3 years ago

brightsparc commented 3 years ago

Describe the feature you'd like

Extend the constructors of the TrainingStep and ProcessingStep classes to accept an optional experiment_config dictionary, which is passed down when the job arguments are created.

eg: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/steps.py#L136
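One way to picture "passed down when creating the job args" is the minimal sketch below. It does not use the SDK: SketchStep and its base_args parameter are hypothetical stand-ins invented for illustration. The only detail taken from the SageMaker API is that CreateTrainingJob and CreateProcessingJob requests accept an ExperimentConfig field.

```python
from typing import Any, Dict, Optional


class SketchStep:
    """Hypothetical stand-in for TrainingStep/ProcessingStep (not the SDK class)."""

    def __init__(
        self,
        name: str,
        base_args: Dict[str, Any],
        experiment_config: Optional[Dict[str, str]] = None,
    ) -> None:
        self.name = name
        self.base_args = base_args
        self.experiment_config = experiment_config

    @property
    def arguments(self) -> Dict[str, Any]:
        # Copy so the caller's dictionary is not mutated.
        request = dict(self.base_args)
        # Add the field only when a config was supplied, so it stays optional.
        if self.experiment_config:
            request["ExperimentConfig"] = dict(self.experiment_config)
        return request


step = SketchStep(
    name="AbaloneTrain",
    base_args={"RoleArn": "arn:aws:iam::123456789012:role/example"},  # placeholder value
    experiment_config={"ExperimentName": "my-project", "TrialName": "my-commit-hash"},
)
print(step.arguments)
```

Keeping the argument optional and omitting the field when it is absent means existing pipelines would build the same request as before.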

How would this feature be used? Please describe.

This feature would allow the caller to pass an experiment name and a trial name down to a step that is part of a pipeline.

from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import TrainingStep

step_train = TrainingStep(
    name="AbaloneTrain",
    estimator=xgb_train,
    inputs={
        "train": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "train"
            ].S3Output.S3Uri,
            content_type="text/csv"
        ),
        "validation": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "validation"
            ].S3Output.S3Uri,
            content_type="text/csv"
        )
    },
    # Proposed argument requested by this issue.
    experiment_config={
        "ExperimentName": "my-project",
        "TrialName": "my-commit-hash",
        "TrialComponentDisplayName": "Training",
    },
)

Describe alternatives you've considered

An alternative would be to attach the trial components and tags after the job has been created, but this would require an additional API call.
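For comparison, the after-the-fact alternative might look roughly like the sketch below. The helper name attach_after_the_fact and all argument values are hypothetical. The client is assumed to be a boto3 SageMaker client, whose associate_trial_component and add_tags operations are used here, and it is passed in as a parameter so the sketch itself needs no AWS access.

```python
from typing import Any, Dict, Optional


def attach_after_the_fact(
    client: Any,
    trial_component_name: str,
    trial_name: str,
    resource_arn: str,
    tags: Optional[Dict[str, str]] = None,
) -> None:
    """Hypothetical helper: link an existing job to a trial and tag it.

    `client` is assumed to behave like boto3.client("sagemaker").
    """
    # Extra call 1: associate the job's trial component with the trial.
    client.associate_trial_component(
        TrialComponentName=trial_component_name,
        TrialName=trial_name,
    )
    # Extra call 2: tag the job resource, only if tags were given.
    if tags:
        client.add_tags(
            ResourceArn=resource_arn,
            Tags=[{"Key": key, "Value": value} for key, value in tags.items()],
        )
```

Each pipeline execution would need these calls once the job exists, which is the overhead the proposed constructor arguments would avoid.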

Additional context

This would provide feature parity with the AWS Step Functions Data Science SDK: https://aws-step-functions-data-science-sdk.readthedocs.io/en/stable/sagemaker.html

ajaykarpur commented 3 years ago

Hi @brightsparc, thank you for the suggestion!

brightsparc commented 3 years ago

@ajaykarpur PR #2082 has passed tests and is ready for review/merge.

liujiaorr commented 6 months ago

Can I get an update on whether this request is still needed?