The problem occurs when I create a training job using the SageMaker Python SDK:
from sagemaker.remote_function import RemoteExecutor

with RemoteExecutor(instance_type="ml.g4dn.2xlarge", dependencies="./timeseries_env.yml", max_parallel_jobs=1, keep_alive_period_in_seconds=30) as executor:
    future = executor.submit(training_job, arg1, arg2)
After the dependencies are installed from a YAML file (content provided below), the job freezes for more than an hour on the following log line: INFO: Invoking remote function inside conda environment: sagemaker-runtime-env.
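One way to keep the calling code from blocking for an hour while the cause is investigated is to put a deadline on the wait for the submitted job. The sketch below illustrates that pattern using only the standard library's concurrent.futures as a stand-in for RemoteExecutor. The training_job body and the run_with_deadline helper are hypothetical, and whether the SageMaker future accepts the same timeout argument is an assumption that should be checked against the SDK documentation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def training_job(arg1, arg2):
    # Hypothetical stand-in for the real training function: it sleeps
    # briefly to imitate a job that is slow to start, then returns a value.
    time.sleep(1)
    return arg1 + arg2


def run_with_deadline(fn, *args, deadline_seconds):
    """Submit fn and stop waiting for it after deadline_seconds.

    Returns ("finished", result) or ("stalled", None). This only bounds
    the caller's wait; it does not cancel work that is already running.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args)
    try:
        return "finished", future.result(timeout=deadline_seconds)
    except FutureTimeout:
        return "stalled", None
    finally:
        # Do not block here on the worker thread finishing.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    print(run_with_deadline(training_job, 1, 2, deadline_seconds=0.2))
    print(run_with_deadline(training_job, 1, 2, deadline_seconds=10))
```

With a deadline in place, a stalled run surfaces as an explicit status that can be logged with a timestamp, which may help narrow down how long the job sits on the conda invocation line.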
Product Version
Issue Description
The problem occurs when I create a training job using the SageMaker Python SDK.
After the dependencies are installed from a YAML file (content provided below), the job freezes for more than an hour on the following log line: INFO: Invoking remote function inside conda environment: sagemaker-runtime-env.
Expected Behavior
The job should not freeze for so long on that line.
Observed Behavior
No response
Product Category
Jobs
Feedback Category
Customer Support, Reliability and Stability, Startup Time and Latency
Other Details
No response