UnexpectedStatusException: Error for Training job autogluon-sagemaker-training-2020-05-10-07-12-25-701: Failed. Reason: AlgorithmError: ExecuteUserScriptError:
Command "/usr/local/bin/python3.6 train.py --label y"
The log provided by training container was bellow.
Traceback (most recent call last):
File "train.py", line 20, in <module>
import autogluon as ag
File "package/autogluon/__init__.py", line 10, in <module>
from . import scheduler, searcher, utils
File "package/autogluon/scheduler/__init__.py", line 6, in <module>
from .fifo import *
File "package/autogluon/scheduler/fifo.py", line 17, in <module>
from ..searcher import BaseSearcher
File "package/autogluon/searcher/__init__.py", line 2, in <module>
from .skopt_searcher import *
File "package/autogluon/searcher/skopt_searcher.py", line 6, in <module>
from skopt import Optimizer
File "package/skopt/__init__.py", line 55, in <module>
from .searchcv import BayesSearchCV
File "package/skopt/searchcv.py", line 16, in <module>
from sklearn.utils.fixes import MaskedArray
ImportError: cannot import name 'MaskedArray'
2020-05-11 01:47:03,374 sagemaker-containers ERROR ExecuteUserScriptError:
Command "/usr/local/bin/python3.6 train.py --label y"
Traceback (most recent call last):
File "/usr/local/bin/dockerd-entrypoint.py", line 20, in <module>
subprocess.check_call(shlex.split(' '.join(sys.argv[1:])))
File "/usr/local/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['train']' returned non-zero exit status 1.
To reproduce
Create a SageMaker notebook instance.
Create a new role that can access any S3 bucket.
Set the Git repository option to clone the amazon-sagemaker example repository to this notebook instance only.
Description of the error
When running the cell in AutoGluon_Tabular_SageMaker.ipynb that launches the training job, I got the error shown in the training container log above.
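The traceback above ends in `ImportError: cannot import name 'MaskedArray'`, raised when `skopt/searchcv.py` imports from `sklearn.utils.fixes`. That usually points to a mismatch between the installed scikit-optimize and scikit-learn versions, although the log does not say which versions the container had. Below is a minimal diagnostic sketch that could be run inside the training environment to check this. The helper names `probe_import` and `installed_version` are hypothetical and not part of AutoGluon, scikit-optimize, or SageMaker.

```python
import importlib


def probe_import(module_name, attr_name):
    """Check whether `attr_name` is defined in `module_name`.

    Returns (ok, detail). This approximates `from module_name import attr_name`
    for attributes defined in the module; it does not try to import submodules.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, "cannot import {}: {}".format(module_name, exc)
    if hasattr(module, attr_name):
        return True, "{}.{} is available".format(module_name, attr_name)
    return False, "{} has no attribute {}".format(module_name, attr_name)


def installed_version(package_name):
    """Return the package's __version__, or a note explaining why it is unknown."""
    try:
        module = importlib.import_module(package_name)
    except ImportError as exc:
        # Either the package is missing or importing it fails,
        # as skopt does in the traceback above.
        return "import failed ({})".format(exc)
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # The import that fails in the traceback.
    ok, detail = probe_import("sklearn.utils.fixes", "MaskedArray")
    print("MaskedArray probe:", "OK" if ok else "FAILED", "-", detail)
    # Versions of the two packages involved in the failing import chain.
    for name in ("sklearn", "skopt"):
        print(name, "version:", installed_version(name))
```

If the probe fails while scikit-learn itself imports fine, one option is to pin scikit-learn and scikit-optimize to a pair of versions known to work together in the training image. Which versions to pin is left open here because the log does not record them.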