Open ynouri opened 3 years ago
Hello @ynouri ,
Thank you for using Amazon SageMaker.
It's an interesting feature request. The output data from Processor instances is of a similar ilk: there, the output data plays the central role, rather than the model artifacts of Estimator training jobs.
As you mentioned, there are workarounds, but the output data location is not exposed as an attribute of an estimator instance.
We are always re-evaluating our backlog of features based on customer requests, so we appreciate the feedback on this feature.
Describe the feature you'd like
As documented here: https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.EstimatorBase.model_data, the EstimatorBase class provides a convenient property pointing to the model .tar.gz archive location in S3 once estimator.fit() has been called.
In addition to model data, SageMaker can generate "output data" (different from "model data") when files (e.g. experiment logs) are written during training to the directory defined by the environment variable SM_OUTPUT_DATA_DIR.
Having a similar property in the EstimatorBase class, pointing to the output data .tar.gz archive location in S3, would be useful for developers wishing to manipulate that archive. It could be named, for example, estimator.output_data.
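For illustration, a training script might write such files as in the minimal sketch below. The helper name write_experiment_log, the file name experiment_log.json, and the local fallback directory are hypothetical; only the SM_OUTPUT_DATA_DIR environment variable comes from SageMaker.

```python
import json
import os
from pathlib import Path


def write_experiment_log(metrics, filename="experiment_log.json"):
    """Write metrics as JSON into the SageMaker output data directory."""
    # SageMaker sets SM_OUTPUT_DATA_DIR inside the training container.
    # The "./output_data" fallback is an assumption for local runs only.
    output_dir = Path(os.environ.get("SM_OUTPUT_DATA_DIR", "./output_data"))
    output_dir.mkdir(parents=True, exist_ok=True)

    log_path = output_dir / filename
    log_path.write_text(json.dumps(metrics, indent=2))
    return log_path
```

Calling write_experiment_log({"epoch": 1, "loss": 0.42}) from the training entry point would place the file in that directory, so it becomes part of the output data rather than the model data.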
How would this feature be used? Please describe.
Example use case: access estimator.output_data and download the output data archive locally.
Describe alternatives you've considered
Compute the output_data location manually (potentially re-using the .model_data property).
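A rough sketch of that workaround follows. It assumes the output archive is named output.tar.gz and sits next to model.tar.gz under the same S3 prefix; that layout is an assumption here, not something exposed by the SDK. The function names output_data_uri and download_output_data are hypothetical, and the download step uses boto3 directly.

```python
import posixpath
from urllib.parse import urlparse


def output_data_uri(model_data_uri):
    """Derive the output data S3 URI from an estimator's model_data URI."""
    parsed = urlparse(model_data_uri)
    if parsed.scheme != "s3" or not parsed.path.endswith("/model.tar.gz"):
        raise ValueError(f"Unexpected model data URI: {model_data_uri}")
    # Assumption: output.tar.gz is stored alongside model.tar.gz.
    key_dir = posixpath.dirname(parsed.path)
    return f"s3://{parsed.netloc}{key_dir}/output.tar.gz"


def download_output_data(model_data_uri, local_path="output.tar.gz"):
    """Download the derived output data archive to a local file."""
    import boto3  # imported lazily so the URI helper works without boto3

    parsed = urlparse(output_data_uri(model_data_uri))
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path
```

After estimator.fit() has completed, download_output_data(estimator.model_data) would fetch the archive, provided the job actually wrote output data and the layout assumption holds.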