aws-deepracer-community / deepracer-for-cloud

Creates an AWS DeepRacing training environment which can be deployed in the cloud, or locally on Ubuntu Linux, Windows or Mac.
MIT No Attribution

Sagemaker is not running. #135

janice880624 closed this issue 1 year ago

janice880624 commented 1 year ago
Wiping path s3://bucket/rl-deepracer-sagemaker.
delete: s3://bucket/rl-deepracer-sagemaker/reward_function.py
delete: s3://bucket/rl-deepracer-sagemaker/training_params.yaml
Creating Robomaker configuration in s3://bucket/rl-deepracer-sagemaker/training_params.yaml
Updating service deepracer-0_rl_coach (id: 2vm5453qsh95tku57dyyjyw2k)
Updating service deepracer-0_robomaker (id: aj14r054bbpr4ihtgicf5gzp8)
Waiting up to 15 seconds for Sagemaker to start up...
Sagemaker is not running.

system.env

DR_CLOUD=local
DR_AWS_APP_REGION=us-east-1
DR_UPLOAD_S3_PROFILE=default
DR_UPLOAD_S3_BUCKET=aws-deepracer-assets-195dbc60-4d07-4380-b5ab-52eeaeb15376
DR_UPLOAD_S3_ROLE=to-be-defined
DR_LOCAL_S3_BUCKET=bucket
DR_LOCAL_S3_PROFILE=minio
DR_GUI_ENABLE=False
DR_KINESIS_STREAM_NAME=
DR_CAMERA_MAIN_ENABLE=True
DR_CAMERA_SUB_ENABLE=False
DR_CAMERA_KVS_ENABLE=True
DR_SAGEMAKER_IMAGE=5.1.0-gpu
DR_ROBOMAKER_IMAGE=5.1.0-cpu-avx2
DR_MINIO_IMAGE=latest
DR_ANALYSIS_IMAGE=cpu
DR_COACH_IMAGE=5.1.0
DR_WORKERS=1
DR_ROBOMAKER_MOUNT_LOGS=False
# DR_ROBOMAKER_MOUNT_SIMAPP_DIR=
DR_CLOUD_WATCH_ENABLE=False
DR_DOCKER_STYLE=swarm
DR_HOST_X=False
DR_WEBVIEWER_PORT=8100
# DR_DISPLAY=:99
# DR_REMOTE_MINIO_URL=http://mynas:9000
# DR_ROBOMAKER_CUDA_DEVICES=0
# DR_SAGEMAKER_CUDA_DEVICES=0

larsll commented 1 year ago

The "Updating service ..." lines in your log normally mean that the containers do not start again. Try running dr-stop-training and then dr-start-training.
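
The stop-then-start sequence suggested above could be wrapped roughly like this. It is only a sketch: `restart_training` is a hypothetical helper name, not part of deepracer-for-cloud, and it assumes the dr-* commands are already available in the current shell session.

```shell
# Hypothetical helper (not part of deepracer-for-cloud): stop training, then start it again.
# Assumption: dr-stop-training and dr-start-training are already defined in this shell.
restart_training() {
    # Bail out early if the dr-* commands are not available in this shell.
    if ! command -v dr-stop-training >/dev/null 2>&1 ||
       ! command -v dr-start-training >/dev/null 2>&1; then
        echo "dr-stop-training / dr-start-training not found in this shell" >&2
        return 1
    fi

    # Stop any existing training run first, so the containers are created
    # again on start rather than only being updated.
    dr-stop-training

    # Start training again.
    dr-start-training
}

# Usage (run manually):
# restart_training
```

If the helper prints the "not found" message, the dr-* commands are not loaded in that shell and need to be made available before retrying.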

I also suggest heading over to the Community Slack (see https://deepracing.io/); you will get faster help with configuration issues there.