Describe the bug
I tried to use the max_run parameter of sagemaker.pytorch.estimator.PyTorch to set the maximum run time in seconds, but it doesn't work. See the attached screenshot for an example. In the screenshot, I set max_run to 603 seconds, but the job didn't stop at 603: the training time reached 841s, at which point I manually terminated the run.
To reproduce
Set max_run of sagemaker.pytorch.estimator.PyTorch to any integer value and start a training job.
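A minimal sketch of such a setup might look like the following. Only max_run (603), the framework version, and the estimator class come from this report; the entry point name, role ARN, instance type, and py_version are hypothetical placeholders, and launch_training_job is an illustrative helper, not part of the SDK.

```python
# Sketch of a reproduction. Placeholder values are assumptions, not taken
# from the original report: entry_point, role, instance_type, py_version.
MAX_RUN_SECONDS = 603  # value used in the report

estimator_kwargs = {
    "entry_point": "train.py",  # hypothetical training script
    "role": "arn:aws:iam::111122223333:role/ExampleSageMakerRole",  # placeholder
    "instance_count": 1,
    "instance_type": "ml.g4dn.xlarge",  # assumed GPU instance type
    "framework_version": "2.2.0",
    "py_version": "py310",  # assumed to match Python 3.10
    "max_run": MAX_RUN_SECONDS,  # expected cap on training time, in seconds
}


def launch_training_job():
    """Create the estimator and start the job.

    Requires the SageMaker Python SDK and valid AWS credentials, so the
    import is done lazily and nothing runs until this function is called.
    """
    from sagemaker.pytorch.estimator import PyTorch

    estimator = PyTorch(**estimator_kwargs)
    estimator.fit()
    return estimator
```

Calling launch_training_job() starts a real, billable training job, so it is left uncalled here.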
Expected behavior
I expect the SageMaker training run to terminate once the number of seconds set in max_run has elapsed.
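One way to check this expectation against what the service recorded is to compare the job's stopping condition with its reported training time. The sketch below assumes the response shape of boto3's describe_training_job (the StoppingCondition.MaxRuntimeInSeconds and TrainingTimeInSeconds fields); the helper names overran_max_run and describe_job are hypothetical, and I have not confirmed that this is the same figure the console screenshot shows.

```python
def overran_max_run(description: dict) -> bool:
    """Return True if the reported training time exceeds the recorded limit.

    `description` is assumed to be the dict returned by
    describe_training_job. TrainingTimeInSeconds may be absent while the
    job is still running, in which case this returns False.
    """
    limit = description["StoppingCondition"]["MaxRuntimeInSeconds"]
    elapsed = description.get("TrainingTimeInSeconds")
    if elapsed is None:
        return False
    return elapsed > limit


def describe_job(job_name: str) -> dict:
    """Fetch the job description (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("sagemaker")
    return client.describe_training_job(TrainingJobName=job_name)


# Illustrative input using the numbers from this report, not real API output.
sample = {
    "StoppingCondition": {"MaxRuntimeInSeconds": 603},
    "TrainingTimeInSeconds": 841,
}
print(overran_max_run(sample))
```

If MaxRuntimeInSeconds in the real response is not 603, the value passed as max_run never reached the service, which would narrow the problem down to the SDK side.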
Screenshots or logs
See screenshot in description
System information
A description of your system. Please provide:
SageMaker Python SDK version: 2.207.1
Framework name (e.g. PyTorch) or algorithm (e.g. KMeans): PyTorch
Framework version: 2.2.0
Python version: 3.10.1
CPU or GPU: CPU locally, and a GPU instance on SageMaker
Additional context
NA