Add support to define Training parameters dynamically using placeholders in TrainingStep
Fixes #138, #39 and #42
Why is the change necessary?
Currently, it is not possible to use placeholders for SageMaker Training properties. The properties cannot be defined dynamically because they must be set on the Estimator, which does not accept placeholders.
This change makes it possible to use placeholders for Training properties through the parameters field that is passed down from the TrainingStep.
This addresses 3 issues:
#138: Setting the Environment variables for TrainingStep will be possible by providing those variables in the parameters argument (see the CreateTrainingJob request for more details).
#39: Using a different input S3 location will be possible by using placeholders such as ExecutionInput to define the InputDataConfig in the parameters argument.
#42: Setting EnableManagedSpotTraining will be possible by defining that parameter in the TrainingStep parameters argument (see the CreateTrainingJob request).
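To make the three cases concrete, here is a minimal sketch of what such a parameters value might look like. The key names follow the CreateTrainingJob request schema; the helper `build_training_parameters`, the channel name, the bucket, and the environment variable are hypothetical and only serve the illustration. Plain Python values are used here so the snippet runs on its own; in a workflow, the values could instead be placeholders such as ExecutionInput entries.

```python
def build_training_parameters(input_s3_uri, environment, use_spot):
    """Return a dict shaped like a partial CreateTrainingJob request.

    Hypothetical helper for illustration only; it is not part of the SDK.
    """
    return {
        # Issue 138: environment variables for the training container
        "Environment": environment,
        # Issue 42: managed spot training toggle
        "EnableManagedSpotTraining": use_spot,
        # Issue 39: input location supplied at definition or execution time
        "InputDataConfig": [
            {
                "ChannelName": "training",  # assumed channel name
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,
                    }
                },
            }
        ],
    }


parameters = build_training_parameters(
    input_s3_uri="s3://example-bucket/train/",  # made-up location
    environment={"LOG_LEVEL": "debug"},         # made-up variable
    use_spot=True,
)
```

The resulting dict would then be passed as the parameters argument of the TrainingStep.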
Solution
Use the parameters field that is compatible with placeholders to define TrainingStep properties.
Merge the parameters generated from the Estimator with the input parameters:
The input parameters will overwrite the parameters generated from the Estimator when a property is defined in both.
All TrainingStep properties will be placeholder compatible except for data, which may require RecordSets depending on the estimator used to define the tuner.
The input parameters should follow the schema described in the CreateTrainingJob API documentation.
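The merge rule above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: `merge_parameters` is a hypothetical helper, and the description does not say whether nested properties are merged key by key or replaced wholesale. The sketch assumes nested dicts are merged recursively and every other value (including lists) is replaced by the input value.

```python
import copy


def merge_parameters(estimator_params, input_params):
    """Merge two parameter dicts; input_params wins on conflicts.

    Hypothetical helper: nested dicts are merged recursively (an assumption),
    all other values from input_params replace the generated ones.
    """
    merged = copy.deepcopy(estimator_params)  # leave the original untouched
    for key, value in input_params.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_parameters(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Parameters as they might be generated from an Estimator (made-up values)
generated = {
    "ResourceConfig": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    "EnableManagedSpotTraining": False,
}
# Parameters supplied through the TrainingStep parameters argument
overrides = {
    "ResourceConfig": {"InstanceCount": 2},
    "EnableManagedSpotTraining": True,
    "Environment": {"LOG_LEVEL": "debug"},
}

merged = merge_parameters(generated, overrides)
```

With these inputs, properties present only in one dict are kept, and properties present in both take the value from `overrides`.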
The same solution was adopted for "feat: Support placeholders for processing step".
Testing
Added unit and integration tests
Pull Request Checklist
Please check all boxes (including N/A items)
Testing
Documentation
Title and description
Fixes #xxx
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.