Release of StepFunctions Python SDK Version 2 and support timeline for Version 1
With Python 2 having reached End of Life on January 1, 2020 and the release of SageMaker Python SDK V2, we will be releasing V2 of the AWS Step Functions Data Science SDK. This issue describes the changes, release timeline for V2, and end-of-life support for V1.
This major version bump will include the following breaking changes:
* Deprecate Python 2 support for the StepFunctions Python SDK
* Upgrade sagemaker dependency from 1.x to 2.x
If you would like to try V2 today, you can install the 2.0.0rc1 pre-release candidate from PyPI. Instructions for migrating from V1 to V2 of the Step Functions Data Science SDK are given below.
Timeline
We are targeting the official release of V2 for February 2021. The 2.0.0rc1 pre-release was made available on September 23, 2020.
The “data” parameter for TrainingStep and TuningStep classes was changed from:
data: Information about the training data. Please refer to the ``fit()`` method of the associated estimator, as this can take any of the following forms:
* (str) - The S3 location where training data is saved.
* (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) - If using multiple
channels for training data, you can specify a dict mapping channel names to
strings or :func:`~sagemaker.inputs.TrainingInput` objects.
* (sagemaker.session.s3_input) - Channel configuration for S3 data sources that can
provide additional information about the training dataset. See
:func:`sagemaker.session.s3_input` for full details.
* (sagemaker.amazon.amazon_estimator.RecordSet) - A collection of
Amazon :class:`Record` objects serialized and stored in S3.
For use with an estimator for an Amazon algorithm.
* (list[sagemaker.amazon.amazon_estimator.RecordSet]) - A list of
:class:`sagemaker.amazon.amazon_estimator.RecordSet` objects,
where each instance is a different channel of training data.
to
data: Information about the training data. Please refer to the ``fit()`` method of the associated estimator, as this can take any of the following forms:
* (str) - The S3 location where training data is saved.
* (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) - If using multiple
channels for training data, you can specify a dict mapping channel names to
strings or :func:`~sagemaker.inputs.TrainingInput` objects.
* (sagemaker.inputs.TrainingInput) - Channel configuration for S3 data sources that can
provide additional information about the training dataset. See
:func:`sagemaker.inputs.TrainingInput` for full details.
* (sagemaker.amazon.amazon_estimator.RecordSet) - A collection of
Amazon :class:`Record` objects serialized and stored in S3.
For use with an estimator for an Amazon algorithm.
* (list[sagemaker.amazon.amazon_estimator.RecordSet]) - A list of
:class:`sagemaker.amazon.amazon_estimator.RecordSet` objects,
where each instance is a different channel of training data.
`sagemaker.session.s3_input` has been renamed to `sagemaker.inputs.TrainingInput` in SageMaker Python SDK 2.x.
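As a rough illustration of the rename, the sketch below looks up whichever channel-configuration class the installed SageMaker SDK exposes and uses it to build the dict form of `data`. The helper name `resolve_training_input_class` and the S3 URI are hypothetical; the sketch assumes only the two import paths named above and returns `None` when the SageMaker SDK is not installed.

```python
from typing import Optional


def resolve_training_input_class() -> Optional[type]:
    """Return the S3 channel-configuration class of the installed SageMaker SDK.

    Hypothetical helper, not part of either SDK.
    """
    try:
        # Location in SageMaker Python SDK 2.x.
        from sagemaker.inputs import TrainingInput
        return TrainingInput
    except ImportError:
        pass
    try:
        # Location in SageMaker Python SDK 1.x (renamed in 2.x).
        from sagemaker.session import s3_input
        return s3_input
    except ImportError:
        # The SageMaker SDK is not installed at all.
        return None


# Build the dict form of ``data`` with a placeholder S3 URI.
channel_cls = resolve_training_input_class()
if channel_cls is not None:
    data = {"train": channel_cls("s3://example-bucket/train")}
```

Code that only needs to run against V2 can skip the fallback and import `TrainingInput` from `sagemaker.inputs` directly.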
The “inputs” parameter for TrainingPipeline and InferencePipeline classes was changed from:
inputs: Information about the training data. Please refer to the `fit()` method of the associated estimator, as this can take any of the following forms:
* (str) - The S3 location where training data is saved.
* (dict[str, str] or dict[str, `sagemaker.inputs.TrainingInput`]) - If using multiple channels for training data, you can specify a dict mapping channel names to strings or `sagemaker.inputs.TrainingInput` objects.
* (`sagemaker.session.s3_input`) - Channel configuration for S3 data sources that can provide additional information about the training dataset. See `sagemaker.session.s3_input` for full details.
* (`sagemaker.amazon.amazon_estimator.RecordSet`) - A collection of Amazon `Record` objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.
* (list[`sagemaker.amazon.amazon_estimator.RecordSet`]) - A list of `sagemaker.amazon.amazon_estimator.RecordSet` objects, where each instance is a different channel of training data.
to
inputs: Information about the training data. Please refer to the `fit()` method of the associated estimator, as this can take any of the following forms:
* (str) - The S3 location where training data is saved.
* (dict[str, str] or dict[str, `sagemaker.inputs.TrainingInput`]) - If using multiple channels for training data, you can specify a dict mapping channel names to strings or `sagemaker.inputs.TrainingInput` objects.
* (`sagemaker.inputs.TrainingInput`) - Channel configuration for S3 data sources that can provide additional information about the training dataset. See `sagemaker.inputs.TrainingInput` for full details.
* (`sagemaker.amazon.amazon_estimator.RecordSet`) - A collection of Amazon `Record` objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.
* (list[`sagemaker.amazon.amazon_estimator.RecordSet`]) - A list of `sagemaker.amazon.amazon_estimator.RecordSet` objects, where each instance is a different channel of training data.
`sagemaker.session.s3_input` has been renamed to `sagemaker.inputs.TrainingInput` in SageMaker Python SDK 2.x.
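Because the rename affects every call site that passes a channel object as `inputs` or `data`, it can help to list the places in a project that still mention the old name. The following is a minimal sketch using only the standard library; the function name `find_s3_input_references` is hypothetical, and the match is a plain text search rather than a real Python parse, so it may flag comments or strings.

```python
import re
from typing import List, Tuple

# Matches the 1.x name whether it is fully qualified or imported directly,
# but not identifiers that merely contain it (e.g. my_s3_input_dir).
_S3_INPUT = re.compile(r"\bs3_input\b")


def find_s3_input_references(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that still mention s3_input."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if _S3_INPUT.search(line)
    ]
```

Each reported line is a candidate for replacing `s3_input` with `sagemaker.inputs.TrainingInput`.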
Support timeline for V1
The last v1 release will be made in February 2021. After the February release, updates to v1 will be limited to critical bug fixes until August 2021.
Migration Instructions
Prerequisites:
SageMaker SDK:
If your project uses the SageMaker Python SDK, it must be upgraded to version 2.x.
The official guide for upgrading to SageMaker Python SDK 2.x is at https://sagemaker.readthedocs.io/en/stable/v2.html#breaking-changes
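To confirm which major versions are installed before migrating, a small check like the one below may be useful. It is a sketch that assumes the PyPI distribution names `sagemaker` and `stepfunctions` and Python 3.8 or later (for `importlib.metadata`); the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Extract the major component from a version string such as '2.0.0rc1'."""
    return int(version.split(".")[0])


def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of a distribution, or None if absent."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


# Report the state of the two SDKs discussed in this issue.
for name in ("sagemaker", "stepfunctions"):
    major = installed_major(name)
    if major is None:
        print(f"{name}: not installed")
    elif major < 2:
        print(f"{name}: {major}.x installed, upgrade to 2.x required")
    else:
        print(f"{name}: {major}.x installed")
```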
StepFunctions SDK:
Breaking changes were introduced to the interfaces of the following classes:
TrainingStep and TuningStep (https://github.com/aws/aws-step-functions-data-science-sdk-python/blob/master/src/stepfunctions/steps/sagemaker.py#L36-L50)
The “data” parameter of the TrainingStep and TuningStep classes changed as shown in the docstring comparison earlier in this issue: `sagemaker.session.s3_input` has been renamed to `sagemaker.inputs.TrainingInput` in SageMaker Python SDK 2.x.
TrainingPipeline and InferencePipeline (https://github.com/aws/aws-step-functions-data-science-sdk-python/blob/master/src/stepfunctions/template/pipeline/train.py#L43-L49)
The “inputs” parameter of the TrainingPipeline and InferencePipeline classes changed as shown in the docstring comparison earlier in this issue: `sagemaker.session.s3_input` has been renamed to `sagemaker.inputs.TrainingInput` in SageMaker Python SDK 2.x.