aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

MXNet Container Endpoint Deployment Failed #1473

Open ngluna opened 4 years ago

ngluna commented 4 years ago

In notebook: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/reinforcement_learning/rl_hvac_coach_energyplus/rl_hvac_coach_energyplus.ipynb

Model deployment fails. Testing the model endpoint with

action, action_mean, action_std = predictor.predict(np.array([0., 0., 2.,]))

raises:

AttributeError: 'NoneType' object has no attribute 'predict'

To reproduce the error:

  1. Launch a SageMaker notebook instance
  2. Clone https://github.com/awslabs/amazon-sagemaker-examples.git
  3. Open https://github.com/awslabs/amazon-sagemaker-examples/blob/master/reinforcement_learning/rl_hvac_coach_energyplus/rl_hvac_coach_energyplus.ipynb
  4. Click "Cell" -> "Run All"

hongshanli23 commented 3 years ago

I got the same error. The error message from this cell

predictor = estimator_eval.deploy(initial_instance_count=1,
                             instance_type='ml.t3.medium',
                             source_dir='src',
                             entry_point='deploy-mxnet-coach.py')

is not very illuminating:

Exception in thread Thread-10:
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/image.py", line 618, in run
    _stream_output(self.process)
  File "/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/image.py", line 677, in _stream_output
    raise RuntimeError("Process exited with code: %s" % exit_code)
RuntimeError: Process exited with code: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/image.py", line 623, in run
    raise RuntimeError(msg)
RuntimeError: Failed to run: ['docker-compose', '-f', '/tmp/tmp62zck2uq/docker-compose.yaml', 'up', '--build', '--abort-on-container-exit'], Process exited with code: 1
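
The traceback reports only that the `docker-compose` process exited with code 1, so the underlying cause has to come from the container's own output. One way to dig further, sketched here under assumptions, is to recover the exact command from the error message and rerun it by hand in a terminal. `extract_failed_command` is a hypothetical helper, not part of the SageMaker SDK. It assumes the message has the `Failed to run: [...]` shape shown above and that no argument contains a `]`. The temporary compose file may also no longer exist by the time the command is rerun.

```python
import ast
import shlex


def extract_failed_command(message):
    """Return the argument list embedded in a 'Failed to run: [...]' message."""
    prefix = "Failed to run: "
    start = message.index(prefix) + len(prefix)
    # Assumes the first ']' after the prefix closes the argument list.
    end = message.index("]", start) + 1
    return ast.literal_eval(message[start:end])


message = (
    "Failed to run: ['docker-compose', '-f', "
    "'/tmp/tmp62zck2uq/docker-compose.yaml', 'up', '--build', "
    "'--abort-on-container-exit'], Process exited with code: 1"
)
command = extract_failed_command(message)

# Quote each argument so the line can be pasted into a shell.
shell_line = " ".join(shlex.quote(arg) for arg in command)
print(shell_line)
```

Running the printed command in a terminal on the notebook instance should show the container's startup logs directly, provided the compose file is still present.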