Open Milogav opened 1 year ago
Using for example
source_dir='/home/ubuntu/src entry_point='pipeline/program/run.py
According to the documentation:
The absolute or relative path to the local Python source file that should be executed as the entry point to training. (Default: None). If
source_dir
is specified, thenentry_point
must point to a file located at the root ofsource_dir
.
Your entry point is not at the root of your source dir
Came here to report the same. Is there a reason for having entry_point
point to a file at the root of source_dir
?
I am asking this because it works when running code on AWS with entry_point
set to a file in a subdir of source_dir
, but only fails in local mode due to the aforementioned line that strips the directory.
For anyone who might want a quick workaround, you can subclass the Estimator/Framework class to fix this. For example, see _script_mode_hyperparam_update
method in the following code:
class CustomFramework(Framework):
def __init__(
self,
entry_point,
source_dir=None,
hyperparameters=None,
py_version=None,
framework_version=None,
image_uri=None,
distributions=None,
**kwargs,
):
super(CustomFramework, self).__init__(
entry_point, source_dir, hyperparameters, image_uri=image_uri, **kwargs
)
self.true_entry_point = entry_point
self.framework_version = framework_version
self.py_version = py_version
def _script_mode_hyperparam_update(self, code_dir: str, script: str) -> None:
local_code = get_config_value("local.local_code", self.sagemaker_session.config)
if self.sagemaker_session.local_mode and local_code:
# Workaround if in local mode
return super()._script_mode_hyperparam_update(code_dir, self.true_entry_point)
else:
# Otherwise execute as normal
return super()._script_mode_hyperparam_update(code_dir, script)
Describe the bug Setting source_dir and entry_point arguments for Estimator in local mode results in wrong entry_point path if the script is not located at the source dir root but inside a sub folder
https://github.com/aws/sagemaker-python-sdk/blob/697c465623da296dba0c98a49cfe78c0b0acde04/src/sagemaker/estimator.py#L884-L890
To reproduce
Using for example
results in
Which raises a ValueError due to the wrong path: https://github.com/aws/sagemaker-python-sdk/blob/697c465623da296dba0c98a49cfe78c0b0acde04/src/sagemaker/fw_utils.py#L199
Expected behavior Keep entry point path unchanged if source_dir is provided
Screenshots or logs If applicable, add screenshots or logs to help explain your problem.
System information A description of your system. Please provide:
Additional context Add any other context about the problem here.