Open BaoshengHeTR opened 3 years ago
Hi @BaoshengHeTR, are you using Python SDK? If so, if you use the same path (https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/transformer.py#L59) for multiple different times, you should have the results stored in the same location in S3.
Yes, but doing it that way appends new results to the old ones, right? Can we set up an overwrite mode instead? In Spark, for example, we have write.mode("overwrite").
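For anyone wondering why reusing the same output_path "appends" rather than overwrites: Batch Transform names each output object after its input object with a ".out" suffix, and S3 PUTs overwrite identical keys. So reruns replace outputs for inputs that still exist, but leave behind outputs for inputs that were removed. A small sketch of that effect (the helper names are hypothetical, just for illustration):

```python
# Illustration only: model how Batch Transform derives output keys.
# Each input object key gets a ".out" suffix under output_path; keys that
# repeat across runs are overwritten by S3, but outputs for inputs that no
# longer exist are left behind -- hence the apparent "append" behavior.

def output_keys(input_keys):
    """Map input object keys to the output keys Batch Transform would write."""
    return {key + ".out" for key in input_keys}

def stale_outputs(previous_inputs, current_inputs):
    """Output files from a previous run that a rerun will NOT replace."""
    return output_keys(previous_inputs) - output_keys(current_inputs)

run1 = ["data/part-0000.csv", "data/part-0001.csv"]
run2 = ["data/part-0000.csv"]  # e.g. an upstream Spark job produced fewer partitions

print(stale_outputs(run1, run2))  # leftover output from run 1
```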
Any update on this? I also need an overwrite mode especially when the input S3 path is the output from a spark job.
Same issue here. It would be ideal to be able to overwrite previous results from batch inferences instead of appending them, and the same feature for processing jobs.
Throwing in another vote for this functionality. We had to modify our Airflow task to clean the directory before starting the prediction task, but it'd be nicer to be able to use .mode("overwrite") instead.
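In case it helps others, here is a minimal sketch of that clean-before-run workaround, assuming output_path is a plain s3://bucket/prefix URI (function names are mine, not part of the SDK):

```python
# Emulate "overwrite" mode by emptying the output prefix before each run.
from urllib.parse import urlparse

def split_s3_uri(uri):
    """Split s3://bucket/prefix into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

def clear_prefix(uri):
    """Delete every object under the prefix so the next transform job
    starts from an empty output location."""
    import boto3  # imported here so the parsing helper works without AWS creds
    bucket, prefix = split_s3_uri(uri)
    s3 = boto3.resource("s3")
    s3.Bucket(bucket).objects.filter(Prefix=prefix).delete()

# Usage before kicking off the job:
# clear_prefix("s3://my-bucket/batch-output/")
# transformer.transform(...)
```

Be careful that the prefix really is dedicated to this job's output, since everything under it gets deleted.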
I did not find any docs on overwriting batch transform output. If I run the same batch transform job multiple times over time, how should I set up the transformer so it overwrites the output results (i.e., without changing the
output_path
)?