getindata / kedro-sagemaker

Kedro Plugin to support running pipelines on AWS SageMaker.
https://kedro-sagemaker.readthedocs.io
Apache License 2.0

Processing Jobs tagging and multiple nodes running on the same instance. #20

Open DarkJoker9817 opened 7 months ago

DarkJoker9817 commented 7 months ago

Hello, and thanks for this plugin. I have three questions:

1) I want to assign a custom tag to the Processing Jobs launched through this plugin, in addition to the ones assigned automatically, such as 'sagemaker:pipeline-execution-arn' and 'sagemaker:pipeline-step-name'. Is there a way to do this automatically, without going through the AWS Management Console?
2) I want all the nodes to run on the same instance. By default, a new instance is started for each pipeline step and shut down when the step finishes, which increases the total execution time. Is there a way to run all the steps on the same instance to reduce the execution time?
3) Some of my nodes can be executed in parallel. Can I enable this by configuring the plugin in a certain way?