liyunrui closed this issue 10 months ago
New Error
We're running [1] on a SageMaker (SM) notebook. SM training works as expected. However, executor_model is missing from model.tar.gz.
We see 0_transformworkflow, 1_predicttensorflow, and ensemble_model,
but not executor_model.
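One way to confirm which model directories the archive contains is to list its top-level entries. The following is a minimal sketch, assuming the archive has been downloaded locally as model.tar.gz; the helper name top_level_entries is hypothetical and not part of Merlin or SageMaker.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names found inside a tar archive."""
    names = set()
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getnames():
            # PurePosixPath drops a leading "./", so "./a/b" and "a/b" agree.
            parts = PurePosixPath(member).parts
            if parts:
                names.add(parts[0])
    return sorted(names)


# Example usage (assumes model.tar.gz is in the current directory):
# print(top_level_entries("model.tar.gz"))
```

If executor_model is absent from the returned list while ensemble_model is present, the archive matches the situation described above.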
Please check which version of the merlin-systems package you have installed. Since the 23.02.00 release, the default entrypoint model is called executor_model (it was previously called ensemble_model).
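The version check can be scripted. This is a minimal sketch, assuming the package is installed under the distribution name merlin-systems and that comparing the first two numeric components of the version string is sufficient; both helper names are hypothetical, and the 23.02 cutoff is taken from the reply above.

```python
from importlib.metadata import PackageNotFoundError, version


def expected_entrypoint(systems_version):
    """Map a merlin-systems version string to the entrypoint name described above."""
    # Simplified parsing: compare only the first two numeric components,
    # e.g. "23.02.00" -> (23, 2). Assumes both components are plain integers.
    major, minor = (int(part) for part in systems_version.split(".")[:2])
    return "executor_model" if (major, minor) >= (23, 2) else "ensemble_model"


def installed_entrypoint():
    """Return the expected entrypoint name, or None if merlin-systems is not installed."""
    try:
        return expected_entrypoint(version("merlin-systems"))
    except PackageNotFoundError:
        return None


# Example usage:
# print(installed_entrypoint())
```

Running this in the same environment that exports the model (for example, inside the training container) shows which directory name to expect in model.tar.gz.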
I executed the notebook in [1] in my AWS environment, but got the error below:
2023-06-23 06:01:56,268 sagemaker-training-toolkit ERROR Reporting training FAILURE
2023-06-23 06:01:56,268 sagemaker-training-toolkit ERROR Framework Error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sagemaker_training/trainer.py", line 99, in train
    entry_point.run(
  File "/usr/local/lib/python3.8/dist-packages/sagemaker_training/entry_point.py", line 93, in run
    install(name=user_entry_point, path=environment.code_dir, capture_error=capture_error)
  File "/usr/local/lib/python3.8/dist-packages/sagemaker_training/entry_point.py", line 118, in install
    entry_point_type = _entry_point_type.get(path, name)
  File "/usr/local/lib/python3.8/dist-packages/sagemaker_training/_entry_point_type.py", line 43, in get
    if name.endswith(".sh"):
AttributeError: 'NoneType' object has no attribute 'endswith'
'NoneType' object has no attribute 'endswith'
2023-06-23 06:01:56,268 sagemaker-training-toolkit ERROR Encountered exit_code 1
2023-06-23 06:02:32 Failed - Training job failed
ProfilerReport-1687499644: Stopping
[1]. https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/sagemaker-tensorflow/sagemaker-merlin-tensorflow.ipynb