Closed spoecker closed 5 years ago
Looks like a regression of issue #26 caused by the recent update. I forgot to resync the Python files in rl_coach/src/markov.
Can you pull the most recent commit and try again?
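One way to catch this kind of drift before building an image is to compare the two copies of the Python files. This is a minimal sketch, assuming the files are mirrored between two directories. The function name find_out_of_sync and whatever paths are passed in are hypothetical and not part of the repository.

```python
import filecmp
import os


def find_out_of_sync(src_dir, dst_dir):
    """Return sorted relative paths of .py files under src_dir that are
    missing from dst_dir or whose contents differ from the copy in dst_dir."""
    stale = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if not name.endswith(".py"):
                continue
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, src_dir)
            dst_path = os.path.join(dst_dir, rel)
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting file size and modification time.
            if not os.path.isfile(dst_path) or not filecmp.cmp(
                src_path, dst_path, shallow=False
            ):
                stale.append(rel)
    return sorted(stale)
```

An empty list means every Python file in the source directory has an identical copy in the destination. The check is one-directional: extra files that exist only in the destination are not reported.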
I tried, sadly same result.
Can you post the full log from the terminal at the bottom there? It will have the stack trace showing where this error is coming from, which will give me a good idea of what is causing the issue.
(sagemaker_venv) [CORP\spoecker@a-3962e11qoanik rl_coach]$ python rl_deepracer_coach_robomaker.py
Looking for config file: /home/spoecker/.sagemaker/config.yaml
Model checkpoints and other metadata will be stored at: s3://bucket/rl-deepracer-sagemaker
Uploading to s3://bucket/rl-deepracer-sagemaker
WARNING:sagemaker:Parameter image_name is specified, toolkit, toolkit_version, framework are going to be ignored when choosing the image.
s3.ServiceResource()
Using provided s3_client
INFO:sagemaker:Creating training-job with name: rl-deepracer-sagemaker
Starting training job
Using /home/spoecker/Desktop/Deepracer/robo/container for container temp files
Using /home/spoecker/Desktop/Deepracer/robo/container for container temp files
Trying to launch image: crr0004/sagemaker-rl-tensorflow:console
Creating tmprbhlxguc_algo-1-0bx09_1 ... done
Attaching to tmprbhlxguc_algo-1-0bx09_1
algo-1-0bx09_1 | $1 is train
algo-1-0bx09_1 | In train start.sh
algo-1-0bx09_1 | Current host is "algo-1-0bx09"
algo-1-0bx09_1 | Compiling changehostname.c
algo-1-0bx09_1 | Done Compiling changehostname.c
algo-1-0bx09_1 | 23:C 15 Jul 2019 03:06:47.323 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
algo-1-0bx09_1 | 23:C 15 Jul 2019 03:06:47.323 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=23, just started
algo-1-0bx09_1 | 23:C 15 Jul 2019 03:06:47.323 # Configuration loaded
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
algo-1-0bx09_1 | [Redis ASCII-art startup banner: Redis 5.0.5 (00000000/0) 64 bit, Running in standalone mode, Port: 6379, PID: 23, http://redis.io]
algo-1-0bx09_1 |
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # Server initialized
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
algo-1-0bx09_1 | 23:M 15 Jul 2019 03:06:47.324 * Ready to accept connections
algo-1-0bx09_1 | 15/07/2019 03:06:47 passing arg to libvncserver: -rfbport
algo-1-0bx09_1 | 15/07/2019 03:06:47 passing arg to libvncserver: 5800
algo-1-0bx09_1 | 15/07/2019 03:06:47 x11vnc version: 0.9.13 lastmod: 2011-08-10 pid: 24
algo-1-0bx09_1 | 15/07/2019 03:06:47
algo-1-0bx09_1 | 15/07/2019 03:06:47 wait_for_client: WAIT:0
algo-1-0bx09_1 | 15/07/2019 03:06:47
algo-1-0bx09_1 | 15/07/2019 03:06:47 initialize_screen: fb_depth/fb_bpp/fb_Bpl 24/32/2560
algo-1-0bx09_1 | 15/07/2019 03:06:47
algo-1-0bx09_1 | 15/07/2019 03:06:47 Listening for VNC connections on TCP port 5800
algo-1-0bx09_1 | 15/07/2019 03:06:47 Listening for VNC connections on TCP6 port 5900
algo-1-0bx09_1 | 15/07/2019 03:06:47 Listening also on IPv6 port 5800 (socket 6)
algo-1-0bx09_1 | 15/07/2019 03:06:47
algo-1-0bx09_1 |
algo-1-0bx09_1 | The VNC desktop is: e196104c793c:5800
algo-1-0bx09_1 | 15/07/2019 03:06:47 possible alias: e196104c793c::5800
algo-1-0bx09_1 | PORT=5800
algo-1-0bx09_1 | 2019-07-15 03:06:48,984 sagemaker-containers INFO Imported framework sagemaker_tensorflow_container.training
algo-1-0bx09_1 | 2019-07-15 03:06:48,990 sagemaker-containers INFO No GPUs detected (normal if no gpus installed)
algo-1-0bx09_1 | 2019-07-15 03:06:49,047 sagemaker-containers INFO No GPUs detected (normal if no gpus installed)
algo-1-0bx09_1 | 2019-07-15 03:06:49,065 sagemaker-containers INFO No GPUs detected (normal if no gpus installed)
algo-1-0bx09_1 | 2019-07-15 03:06:49,079 sagemaker-containers INFO Invoking user script
algo-1-0bx09_1 |
algo-1-0bx09_1 | Training Env:
algo-1-0bx09_1 |
algo-1-0bx09_1 | {
algo-1-0bx09_1 | "additional_framework_parameters": {
algo-1-0bx09_1 | "sagemaker_estimator": "RLEstimator"
algo-1-0bx09_1 | },
algo-1-0bx09_1 | "channel_input_dirs": {},
algo-1-0bx09_1 | "current_host": "algo-1-0bx09",
algo-1-0bx09_1 | "framework_module": "sagemaker_tensorflow_container.training:main",
algo-1-0bx09_1 | "hosts": [
algo-1-0bx09_1 | "algo-1-0bx09"
algo-1-0bx09_1 | ],
algo-1-0bx09_1 | "hyperparameters": {
algo-1-0bx09_1 | "s3_bucket": "bucket",
algo-1-0bx09_1 | "s3_prefix": "rl-deepracer-sagemaker",
algo-1-0bx09_1 | "aws_region": "us-east-1",
algo-1-0bx09_1 | "model_metadata_s3_key": "s3://bucket/custom_files/model_metadata.json",
algo-1-0bx09_1 | "RLCOACH_PRESET": "deepracer",
algo-1-0bx09_1 | "loss_type": "mean squared error"
algo-1-0bx09_1 | },
algo-1-0bx09_1 | "input_config_dir": "/opt/ml/input/config",
algo-1-0bx09_1 | "input_data_config": {},
algo-1-0bx09_1 | "input_dir": "/opt/ml/input",
algo-1-0bx09_1 | "is_master": true,
algo-1-0bx09_1 | "job_name": "rl-deepracer-sagemaker",
algo-1-0bx09_1 | "log_level": 20,
algo-1-0bx09_1 | "master_hostname": "algo-1-0bx09",
algo-1-0bx09_1 | "model_dir": "/opt/ml/model",
algo-1-0bx09_1 | "module_dir": "s3://bucket/rl-deepracer-sagemaker/source/sourcedir.tar.gz",
algo-1-0bx09_1 | "module_name": "training_worker",
algo-1-0bx09_1 | "network_interface_name": "eth0",
algo-1-0bx09_1 | "num_cpus": 8,
algo-1-0bx09_1 | "num_gpus": 0,
algo-1-0bx09_1 | "output_data_dir": "/opt/ml/output/data",
algo-1-0bx09_1 | "output_dir": "/opt/ml/output",
algo-1-0bx09_1 | "output_intermediate_dir": "/opt/ml/output/intermediate",
algo-1-0bx09_1 | "resource_config": {
algo-1-0bx09_1 | "current_host": "algo-1-0bx09",
algo-1-0bx09_1 | "hosts": [
algo-1-0bx09_1 | "algo-1-0bx09"
algo-1-0bx09_1 | ]
algo-1-0bx09_1 | },
algo-1-0bx09_1 | "user_entry_point": "training_worker.py"
algo-1-0bx09_1 | }
algo-1-0bx09_1 |
algo-1-0bx09_1 | Environment variables:
algo-1-0bx09_1 |
algo-1-0bx09_1 | SM_HOSTS=["algo-1-0bx09"]
algo-1-0bx09_1 | SM_NETWORK_INTERFACE_NAME=eth0
algo-1-0bx09_1 | SM_HPS={"RLCOACH_PRESET":"deepracer","aws_region":"us-east-1","loss_type":"mean squared error","model_metadata_s3_key":"s3://bucket/custom_files/model_metadata.json","s3_bucket":"bucket","s3_prefix":"rl-deepracer-sagemaker"}
algo-1-0bx09_1 | SM_USER_ENTRY_POINT=training_worker.py
algo-1-0bx09_1 | SM_FRAMEWORK_PARAMS={"sagemaker_estimator":"RLEstimator"}
algo-1-0bx09_1 | SM_RESOURCE_CONFIG={"current_host":"algo-1-0bx09","hosts":["algo-1-0bx09"]}
algo-1-0bx09_1 | SM_INPUT_DATA_CONFIG={}
algo-1-0bx09_1 | SM_OUTPUT_DATA_DIR=/opt/ml/output/data
algo-1-0bx09_1 | SM_CHANNELS=[]
algo-1-0bx09_1 | SM_CURRENT_HOST=algo-1-0bx09
algo-1-0bx09_1 | SM_MODULE_NAME=training_worker
algo-1-0bx09_1 | SM_LOG_LEVEL=20
algo-1-0bx09_1 | SM_FRAMEWORK_MODULE=sagemaker_tensorflow_container.training:main
algo-1-0bx09_1 | SM_INPUT_DIR=/opt/ml/input
algo-1-0bx09_1 | SM_INPUT_CONFIG_DIR=/opt/ml/input/config
algo-1-0bx09_1 | SM_OUTPUT_DIR=/opt/ml/output
algo-1-0bx09_1 | SM_NUM_CPUS=8
algo-1-0bx09_1 | SM_NUM_GPUS=0
algo-1-0bx09_1 | SM_MODEL_DIR=/opt/ml/model
algo-1-0bx09_1 | SM_MODULE_DIR=s3://bucket/rl-deepracer-sagemaker/source/sourcedir.tar.gz
algo-1-0bx09_1 | SM_TRAINING_ENV={"additional_framework_parameters":{"sagemaker_estimator":"RLEstimator"},"channel_input_dirs":{},"current_host":"algo-1-0bx09","framework_module":"sagemaker_tensorflow_container.training:main","hosts":["algo-1-0bx09"],"hyperparameters":{"RLCOACH_PRESET":"deepracer","aws_region":"us-east-1","loss_type":"mean squared error","model_metadata_s3_key":"s3://bucket/custom_files/model_metadata.json","s3_bucket":"bucket","s3_prefix":"rl-deepracer-sagemaker"},"input_config_dir":"/opt/ml/input/config","input_data_config":{},"input_dir":"/opt/ml/input","is_master":true,"job_name":"rl-deepracer-sagemaker","log_level":20,"master_hostname":"algo-1-0bx09","model_dir":"/opt/ml/model","module_dir":"s3://bucket/rl-deepracer-sagemaker/source/sourcedir.tar.gz","module_name":"training_worker","network_interface_name":"eth0","num_cpus":8,"num_gpus":0,"output_data_dir":"/opt/ml/output/data","output_dir":"/opt/ml/output","output_intermediate_dir":"/opt/ml/output/intermediate","resource_config":{"current_host":"algo-1-0bx09","hosts":["algo-1-0bx09"]},"user_entry_point":"training_worker.py"}
algo-1-0bx09_1 | SM_USER_ARGS=["--RLCOACH_PRESET","deepracer","--aws_region","us-east-1","--loss_type","mean squared error","--model_metadata_s3_key","s3://bucket/custom_files/model_metadata.json","--s3_bucket","bucket","--s3_prefix","rl-deepracer-sagemaker"]
algo-1-0bx09_1 | SM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate
algo-1-0bx09_1 | SM_HP_S3_BUCKET=bucket
algo-1-0bx09_1 | SM_HP_S3_PREFIX=rl-deepracer-sagemaker
algo-1-0bx09_1 | SM_HP_AWS_REGION=us-east-1
algo-1-0bx09_1 | SM_HP_MODEL_METADATA_S3_KEY=s3://bucket/custom_files/model_metadata.json
algo-1-0bx09_1 | SM_HP_RLCOACH_PRESET=deepracer
algo-1-0bx09_1 | SM_HP_LOSS_TYPE=mean squared error
algo-1-0bx09_1 |
algo-1-0bx09_1 | Invoking script with the following command:
algo-1-0bx09_1 |
algo-1-0bx09_1 | /usr/bin/python3.6 training_worker.py --RLCOACH_PRESET deepracer --aws_region us-east-1 --loss_type mean squared error --model_metadata_s3_key s3://bucket/custom_files/model_metadata.json --s3_bucket bucket --s3_prefix rl-deepracer-sagemaker
algo-1-0bx09_1 |
algo-1-0bx09_1 |
algo-1-0bx09_1 | Initializing SageS3Client...
algo-1-0bx09_1 | Successfully downloaded model metadata from custom_files/model_metadata.json.
algo-1-0bx09_1 | Using the following hyper-parameters
algo-1-0bx09_1 | {
algo-1-0bx09_1 | "batch_size": 64,
algo-1-0bx09_1 | "beta_entropy": 0.01,
algo-1-0bx09_1 | "discount_factor": 0.999,
algo-1-0bx09_1 | "e_greedy_value": 0.05,
algo-1-0bx09_1 | "epsilon_steps": 10000,
algo-1-0bx09_1 | "exploration_type": "categorical",
algo-1-0bx09_1 | "loss_type": "mean squared error",
algo-1-0bx09_1 | "lr": 0.0003,
algo-1-0bx09_1 | "num_episodes_between_training": 20,
algo-1-0bx09_1 | "num_epochs": 10,
algo-1-0bx09_1 | "stack_size": 1,
algo-1-0bx09_1 | "term_cond_avg_score": 100000.0,
algo-1-0bx09_1 | "term_cond_max_episodes": 100000
algo-1-0bx09_1 | }
algo-1-0bx09_1 | Uploaded hyperparameters.json to S3
algo-1-0bx09_1 | Uploaded IP address information to S3: 172.18.0.3
algo-1-0bx09_1 | ## Creating graph - name: BasicRLGraphManager
algo-1-0bx09_1 | Loaded action space from file: [{'steering_angle': -25, 'speed': 3.0, 'index': 0}, {'steering_angle': -25, 'speed': 6, 'index': 1}, {'steering_angle': -12.5, 'speed': 3, 'index': 2}, {'steering_angle': -12.5, 'speed': 6, 'index': 3}, {'steering_angle': 0, 'speed': 3, 'index': 4}, {'steering_angle': 0, 'speed': 6, 'index': 5}, {'steering_angle': 12.5, 'speed': 3, 'index': 6}, {'steering_angle': 12.5, 'speed': 6, 'index': 7}, {'steering_angle': 25, 'speed': 3, 'index': 8}, {'steering_angle': 25, 'speed': 6, 'index': 9}]
algo-1-0bx09_1 | ## Creating agent - name: agent
algo-1-0bx09_1 | Checkpoint> Saving in path=['./checkpoint/0_Step-0.ckpt']
algo-1-0bx09_1 | Uploaded 3 files for checkpoint 0
algo-1-0bx09_1 | INFO:tensorflow:Froze 11 variables.
algo-1-0bx09_1 | INFO:tensorflow:Converted 11 variables to const ops.
algo-1-0bx09_1 | saved intermediate frozen graph: rl-deepracer-sagemaker/model/model_0.pb
[CORP\spoecker@a-3962e11qoanik deepracer]$ docker run --rm --name dr --env-file ./robomaker.env --network sagemaker-local -p 8080:5900 -it crr0004/deepracer_robomaker:console
rm: cannot remove 'build': No such file or directory
rm: cannot remove 'install': No such file or directory
Starting >>> deepracer_simulation
[0.293s] WARNING:colcon.colcon_ros.prefix_path.catkin:The path '/opt/ros/kinetic' in the environment variable CMAKE_PREFIX_PATH seems to be a catkin workspace but it doesn't contain any 'local_setup.*' files. Maybe the catkin version is not up-to-date?
Starting >>> sagemaker_rl_agent
Finished <<< sagemaker_rl_agent [0.79s]
Finished <<< deepracer_simulation [4.40s]
Summary: 2 packages finished [4.55s]
15/07/2019 03:06:51 passing arg to libvncserver: -rfbport
15/07/2019 03:06:51 passing arg to libvncserver: 5900
15/07/2019 03:06:51 x11vnc version: 0.9.13 lastmod: 2011-08-10 pid: 793
15/07/2019 03:06:51
15/07/2019 03:06:51 wait_for_client: WAIT:0
15/07/2019 03:06:51
15/07/2019 03:06:51 initialize_screen: fb_depth/fb_bpp/fb_Bpl 24/32/2560
15/07/2019 03:06:51
15/07/2019 03:06:51 Listening for VNC connections on TCP port 5900
15/07/2019 03:06:51 Listening for VNC connections on TCP6 port 5900
15/07/2019 03:06:51 listen6: bind: Address already in use
15/07/2019 03:06:51 Not listening on IPv6 interface.
15/07/2019 03:06:51
The VNC desktop is: 58d5a9fd7994:0
PORT=5900
... logging to /root/.ros/log/97b300e8-a6ad-11e9-a42a-0242ac120002/roslaunch-58d5a9fd7994-794.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.
[ INFO] [1563160011.773430852]: rviz version 1.12.17
[ INFO] [1563160011.773487118]: compiled against Qt version 5.5.1
[ INFO] [1563160011.773507267]: compiled against OGRE version 1.9.0 (Ghadamon)
started roslaunch server http://58d5a9fd7994:40725/
PARAMETERS
NODES
  /racecar/
    controller_manager (controller_manager/spawner)
  /
    agent (deepracer_simulation/run_rollout_rl_agent.sh)
    better_odom (topic_tools/relay)
    car_reset_node (deepracer_simulation/car_node.py)
    gazebo (gazebo_ros/gzserver)
    gazebo_gui (gazebo_ros/gzclient)
    racecar_spawn (gazebo_ros/spawn_model)
    robot_state_publisher (robot_state_publisher/robot_state_publisher)
auto-starting new master
process[master]: started with pid [843]
ROS_MASTER_URI=http://localhost:11311
setting /run_id to 97b300e8-a6ad-11e9-a42a-0242ac120002
process[rosout-1]: started with pid [856]
started core service [/rosout]
IP: 172.18.0.2 (58d5a9fd7994)
process[gazebo-2]: started with pid [872]
process[gazebo_gui-3]: started with pid [879]
process[racecar_spawn-4]: started with pid [884]
process[racecar/controller_manager-5]: started with pid [892]
process[robot_state_publisher-6]: started with pid [894]
process[car_reset_node-7]: started with pid [895]
process[better_odom-8]: started with pid [897]
process[agent-9]: started with pid [905]
/usr/local/lib/python3.5/dist-packages/gym/envs/registration.py:14: PkgResourcesDeprecationWarning: Parameters to load are deprecated. Call .resolve and .require separately.
  result = entry_point.load(False)
[INFO] [1563160015.604416, 1.286000]: Loading controller: right_steering_hinge_position_controller
[INFO] [1563160015.845455, 1.516000]: Loading controller: joint_state_controller
[INFO] [1563160015.863572, 1.534000]: Controller Spawner: Loaded controllers: left_rear_wheel_velocity_controller, right_rear_wheel_velocity_controller, left_front_wheel_velocity_controller, right_front_wheel_velocity_controller, left_steering_hinge_position_controller, right_steering_hinge_position_controller, joint_state_controller
[INFO] [1563160015.875662, 1.542000]: Started controllers: left_rear_wheel_velocity_controller, right_rear_wheel_velocity_controller, left_front_wheel_velocity_controller, right_front_wheel_velocity_controller, left_steering_hinge_position_controller, right_steering_hinge_position_controller, joint_state_controller
Loaded action space from file: [{'index': 0, 'speed': 3.0, 'steering_angle': -25}, {'index': 1, 'speed': 6, 'steering_angle': -25}, {'index': 2, 'speed': 3, 'steering_angle': -12.5}, {'index': 3, 'speed': 6, 'steering_angle': -12.5}, {'index': 4, 'speed': 3, 'steering_angle': 0}, {'index': 5, 'speed': 6, 'steering_angle': 0}, {'index': 6, 'speed': 3, 'steering_angle': 12.5}, {'index': 7, 'speed': 6, 'steering_angle': 12.5}, {'index': 8, 'speed': 3, 'steering_angle': 25}, {'index': 9, 'speed': 6, 'steering_angle': 25}]
SIM_TRACE_LOG:0,0,4.4375,0.5318,-0.0566,0.00,0.00,0,0.0100,False,True,0.6404,1,21.88,1563160016.4792085
SIM_TRACE_LOG:0,1,4.4375,0.5318,-0.0566,-0.44,3.00,0,86887.7977,False,True,0.6404,1,21.88,1563160016.5610707
2019-07-15 03:07:04.701005: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
SIM_TRACE_LOG:0,0,4.4374,0.5318,-0.0566,0.00,0.00,0,0.0100,False,True,0.6400,1,21.88,1563160028.8222458
SIM_TRACE_LOG:0,1,4.4376,0.5318,-0.0563,-0.44,6.00,1,93501.1930,False,True,0.6408,1,21.88,1563160028.9398441
SIM_TRACE_LOG:0,2,4.4470,0.5307,-0.0607,0.00,3.00,4,89572.6359,False,True,0.6841,1,21.88,1563160028.9919877
SIM_TRACE_LOG:0,3,4.4645,0.5286,-0.0676,-0.44,6.00,1,99915.4623,False,True,0.7646,1,21.88,1563160029.0647988
SIM_TRACE_LOG:0,4,4.4996,0.5234,-0.0854,0.00,3.00,4,103377.3954,False,True,0.9266,2,21.88,1563160029.1309657
SIM_TRACE_LOG:0,5,4.5545,0.5148,-0.1080,-0.22,6.00,3,103597.1877,False,True,1.1803,2,21.88,1563160029.2354603
SIM_TRACE_LOG:0,6,4.6381,0.4991,-0.1433,-0.44,3.00,0,75404.3269,False,True,1.5672,3,21.88,1563160029.322305
SIM_TRACE_LOG:0,7,4.6935,0.4845,-0.1809,0.00,3.00,4,52164.1297,False,True,1.8253,3,21.88,1563160029.3778887
SIM_TRACE_LOG:0,8,4.7517,0.4684,-0.2118,-0.22,6.00,3,41922.7395,False,True,2.0969,4,21.88,1563160029.4541104
SIM_TRACE_LOG:0,9,4.8074,0.4504,-0.2464,0.44,6.00,9,18319.2985,False,True,2.3573,4,21.88,1563160029.502825
SIM_TRACE_LOG:0,10,4.8889,0.4215,-0.2913,0.22,3.00,6,2306.2885,False,True,2.7398,5,21.88,1563160029.5639482
SIM_TRACE_LOG:0,11,4.9708,0.3891,-0.3330,-0.22,3.00,2,2104.0866,False,True,3.1235,5,21.88,1563160029.6450453
SIM_TRACE_LOG:0,12,5.0249,0.3644,-0.3697,0.22,3.00,6,1943.0658,False,True,3.3791,6,21.88,1563160029.7375066
SIM_TRACE_LOG:0,13,5.0864,0.3358,-0.3956,0.00,3.00,4,1753.6179,False,True,3.6667,6,21.88,1563160029.807551
SIM_TRACE_LOG:0,14,5.1514,0.3027,-0.4246,-0.22,3.00,2,1531.0547,False,True,3.9747,7,21.88,1563160029.8644927
SIM_TRACE_LOG:0,15,5.1915,0.2809,-0.4451,0.00,3.00,4,1380.1086,False,True,4.1595,7,21.88,1563160029.9266427
SIM_TRACE_LOG:0,16,5.2388,0.2551,-0.4627,0.00,6.00,5,7226.9994,False,True,4.3838,7,21.88,1563160029.998425
SIM_TRACE_LOG:0,17,5.3053,0.2188,-0.4827,0.22,3.00,6,948.5265,False,True,4.6988,8,21.88,1563160030.076893
SIM_TRACE_LOG:0,18,5.3876,0.1742,-0.4898,-0.22,6.00,3,0.0100,False,False,5.0793,9,21.88,1563160030.1630979
SIM_TRACE_LOG:0,19,5.4402,0.1430,-0.5067,-0.22,6.00,3,0.0100,False,False,5.3205,9,21.88,1563160030.21294
SIM_TRACE_LOG:0,20,5.5262,0.0874,-0.5485,0.00,3.00,4,0.0100,False,False,5.7275,10,21.88,1563160030.2679574
SIM_TRACE_LOG:0,21,5.5571,0.0883,-0.7908,0.00,3.00,4,0.0100,False,False,5.8681,10,21.88,1563160030.3390045
SIM_TRACE_LOG:0,22,5.5968,0.0797,-1.1486,0.00,3.00,4,0.0100,False,False,6.0411,10,21.88,1563160030.408615
SIM_TRACE_LOG:0,23,5.6209,0.0635,-1.4066,-0.22,6.00,3,0.0100,False,False,6.1550,10,21.88,1563160030.4816608
SIM_TRACE_LOG:0,24,5.6379,0.0475,-1.6411,-0.22,3.00,2,0.0100,False,False,6.2361,11,21.88,1563160030.5628703
SIM_TRACE_LOG:0,25,5.6392,0.0490,-1.6598,0.44,3.00,8,0.0100,False,False,6.2420,11,21.88,1563160030.64669
SIM_TRACE_LOG:0,26,5.6391,0.0488,-1.6572,-0.22,3.00,2,0.0100,False,False,6.2420,11,21.88,1563160030.6922758
SIM_TRACE_LOG:0,27,5.6387,0.0482,-1.6492,0.44,6.00,9,0.0100,False,False,6.2420,11,21.88,1563160030.7776704
SIM_TRACE_LOG:0,28,5.6386,0.0479,-1.6456,0.44,6.00,9,0.0100,False,False,6.2420,11,21.88,1563160030.8289824
SIM_TRACE_LOG:0,29,5.6382,0.0471,-1.6369,0.22,3.00,6,0.0100,False,False,6.2420,11,21.88,1563160030.9031272
SIM_TRACE_LOG:0,30,5.6382,0.0468,-1.6330,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160030.9813893
SIM_TRACE_LOG:0,31,5.6379,0.0464,-1.6280,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160031.0461774
SIM_TRACE_LOG:0,32,5.6376,0.0460,-1.6230,-0.22,6.00,3,0.0100,False,False,6.2420,11,21.88,1563160031.1192632
SIM_TRACE_LOG:0,33,5.6373,0.0459,-1.6195,-0.44,3.00,0,0.0100,False,False,6.2420,11,21.88,1563160031.1935537
SIM_TRACE_LOG:0,34,5.6374,0.0458,-1.6205,0.44,3.00,8,0.0100,False,False,6.2420,11,21.88,1563160031.254932
SIM_TRACE_LOG:0,35,5.6374,0.0458,-1.6207,-0.22,3.00,2,0.0100,False,False,6.2420,11,21.88,1563160031.3329365
SIM_TRACE_LOG:0,36,5.6374,0.0458,-1.6206,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160031.4067478
SIM_TRACE_LOG:0,37,5.6374,0.0457,-1.6205,-0.44,3.00,0,0.0100,False,False,6.2420,11,21.88,1563160031.4996529
SIM_TRACE_LOG:0,38,5.6373,0.0458,-1.6209,0.00,6.00,5,0.0100,False,False,6.2420,11,21.88,1563160031.544607
SIM_TRACE_LOG:0,39,5.6373,0.0458,-1.6206,0.44,6.00,9,0.0100,False,False,6.2420,11,21.88,1563160031.6019826
SIM_TRACE_LOG:0,40,5.6373,0.0458,-1.6206,-0.22,6.00,3,0.0100,False,False,6.2420,11,21.88,1563160031.6752388
SIM_TRACE_LOG:0,41,5.6373,0.0458,-1.6203,0.44,6.00,9,0.0100,False,False,6.2420,11,21.88,1563160031.75576
SIM_TRACE_LOG:0,42,5.6374,0.0458,-1.6206,0.44,6.00,9,0.0100,False,False,6.2420,11,21.88,1563160031.835736
SIM_TRACE_LOG:0,43,5.6374,0.0458,-1.6206,-0.22,3.00,2,0.0100,False,False,6.2420,11,21.88,1563160031.903566
SIM_TRACE_LOG:0,44,5.6374,0.0457,-1.6206,-0.22,3.00,2,0.0100,False,False,6.2420,11,21.88,1563160032.0048857
SIM_TRACE_LOG:0,45,5.6375,0.0459,-1.6194,0.00,3.00,4,0.0100,False,False,6.2420,11,21.88,1563160032.0623138
SIM_TRACE_LOG:0,46,5.6375,0.0458,-1.6208,0.00,6.00,5,0.0100,False,False,6.2420,11,21.88,1563160032.1338956
SIM_TRACE_LOG:0,47,5.6375,0.0457,-1.6206,0.00,6.00,5,0.0100,False,False,6.2420,11,21.88,1563160032.186223
SIM_TRACE_LOG:0,48,5.6374,0.0457,-1.6206,0.00,3.00,4,0.0100,False,False,6.2420,11,21.88,1563160032.2735958
SIM_TRACE_LOG:0,49,5.6374,0.0458,-1.6205,-0.44,3.00,0,0.0100,False,False,6.2420,11,21.88,1563160032.4049213
SIM_TRACE_LOG:0,50,5.6373,0.0457,-1.6209,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160032.4572916
SIM_TRACE_LOG:0,51,5.6374,0.0459,-1.6209,0.44,3.00,8,0.0100,False,False,6.2420,11,21.88,1563160032.5045547
SIM_TRACE_LOG:0,52,5.6374,0.0458,-1.6204,0.00,3.00,4,0.0100,False,False,6.2420,11,21.88,1563160032.5468073
SIM_TRACE_LOG:0,53,5.6373,0.0458,-1.6207,-0.22,3.00,2,0.0100,False,False,6.2420,11,21.88,1563160032.6250565
SIM_TRACE_LOG:0,54,5.6373,0.0457,-1.6205,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160032.7050037
SIM_TRACE_LOG:0,55,5.6373,0.0458,-1.6208,-0.22,6.00,3,0.0100,False,False,6.2420,11,21.88,1563160032.758985
SIM_TRACE_LOG:0,56,5.6372,0.0458,-1.6206,-0.44,6.00,1,0.0100,False,False,6.2420,11,21.88,1563160032.8285449
SIM_TRACE_LOG:0,57,5.6373,0.0458,-1.6203,0.22,3.00,6,0.0100,False,False,6.2420,11,21.88,1563160032.889745
SIM_TRACE_LOG:0,58,5.6373,0.0458,-1.6207,-0.22,6.00,3,0.0100,False,False,6.2420,11,21.88,1563160032.972706
SIM_TRACE_LOG:0,59,5.6373,0.0457,-1.6205,0.22,6.00,7,0.0100,False,False,6.2420,11,21.88,1563160033.095248
SIM_TRACE_LOG:0,60,5.6373,0.0458,-1.6206,-0.44,3.00,0,0.0000,True,False,6.2420,11,21.88,1563160033.151458
reward: 696968.5468506046
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/util/connection.py", line 57, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.5/socket.py", line 732, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/botocore/httpsession.py", line 262, in send
    chunked=self._chunked(request.headers),
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 641, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.5/dist-packages/urllib3/util/retry.py", line 344, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/packages/six.py", line 686, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 344, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 843, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 316, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connection.py", line 169, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x7f54626aff60>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py", line 303, in
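For anyone reading the SIM_TRACE_LOG lines in the log above, a small parser makes the per-step values easier to inspect. This is a sketch only: the field names are assumptions inferred from the values, not confirmed anywhere in this thread. For example, the sixth to eighth values line up with steering in radians, speed, and the action index from the printed action space. SimTraceStep and parse_sim_trace are hypothetical names.

```python
from typing import NamedTuple, Optional


class SimTraceStep(NamedTuple):
    # All field names below are assumed, not taken from the simulator source.
    episode: int
    step: int
    x: float
    y: float
    yaw: float
    steering: float
    speed: float
    action_index: int
    reward: float
    done: bool
    on_track: bool
    progress: float
    closest_waypoint: int
    track_length: float
    timestamp: float


PREFIX = "SIM_TRACE_LOG:"


def parse_sim_trace(line: str) -> Optional[SimTraceStep]:
    """Parse one SIM_TRACE_LOG line; return None for any other log line."""
    if not line.startswith(PREFIX):
        return None
    f = line[len(PREFIX):].strip().split(",")
    if len(f) != 15:
        return None
    return SimTraceStep(
        int(f[0]), int(f[1]),
        float(f[2]), float(f[3]), float(f[4]),
        float(f[5]), float(f[6]), int(f[7]),
        float(f[8]), f[9] == "True", f[10] == "True",
        float(f[11]), int(f[12]), float(f[13]), float(f[14]),
    )
```

Filtering the parsed steps on the assumed done flag picks out the last step of each episode, which is the step just before the traceback in the log above.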
So it looks like it is still trying to call AWS to cancel the sim job, which results in an HTTP error.
Can you run
docker run --rm --name dr --env-file ./robomaker.env --network sagemaker-local -p 8080:5900 -it crr0004/deepracer_robomaker:console cat /app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py
And post the results here?
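The failure described above (the rollout worker still calling AWS to cancel the simulation job when running locally) could be guarded against along these lines. This is a sketch under assumptions, not the actual code in rollout_worker.py: cancel_simulation_job, the cancel_fn callable, and the DEEPRACER_LOCAL_MODE environment variable are all hypothetical names. It only catches the standard-library errors seen at the bottom of the traceback; botocore wraps these in its own exception types, which a real fix would also need to handle.

```python
import os


def cancel_simulation_job(cancel_fn, local_mode=None):
    """Call cancel_fn() unless running locally.

    Returns "skipped", "cancelled", or "failed" so callers can log the outcome
    instead of crashing at the end of an episode.
    """
    if local_mode is None:
        # Hypothetical switch; the real project may detect local runs differently.
        flag = os.environ.get("DEEPRACER_LOCAL_MODE", "")
        local_mode = flag.lower() in ("1", "true", "yes")
    if local_mode:
        return "skipped"
    try:
        cancel_fn()
    except OSError as exc:
        # socket.gaierror (name resolution) and ConnectionError are OSError subclasses.
        print("Could not cancel simulation job: %s" % exc)
        return "failed"
    return "cancelled"
```

Passing the cancel call in as a callable keeps the guard independent of any particular AWS client, so it can be exercised without network access.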
I have been running docker run --rm --name dr --env-file ./robomaker.env --network sagemaker-local -p 8080:5900 -it crr0004/deepracer_robomaker:console cat /app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py
for 24 hours now.
Nothing is happening, and there is no console output.
Hmm. Odd. I was just trying to see if there was anything in that file that is off. I will double-check the command.
Can you try docker run --rm --name dr --env-file ./robomaker.env --network sagemaker-local -p 8080:5900 -it crr0004/deepracer_robomaker:console "colcon build; cat /app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py"
instead? It's meant to return something immediately. If it hangs, something is wrong.
Can you also run docker run --rm --name dr --env-file ./robomaker.env --network sagemaker-local -p 8080:5900 -it crr0004/deepracer_robomaker:console "colcon build; cat /app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py"?
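Since these commands are expected to return quickly, wrapping them in a timeout avoids waiting indefinitely on a hang. This is a minimal sketch using only the standard library; run_with_timeout is a hypothetical helper, and the 60-second default is an arbitrary choice.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=60):
    """Run cmd (a list of arguments) and return (returncode, output).

    returncode is None when the command did not finish within timeout_s.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return done.returncode, done.stdout


# Example call (not executed here). It assumes the -it flags are dropped
# because no terminal is attached when the command runs from a script:
# run_with_timeout(["docker", "run", "--rm", "--name", "dr",
#                   "--env-file", "./robomaker.env",
#                   "--network", "sagemaker-local",
#                   "crr0004/deepracer_robomaker:console",
#                   "colcon build; cat <path to file>"])
```

On a timeout only the local docker client process is killed, so the container itself may need to be stopped separately.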
Okay, it looks like there were multiple regressions from the file syncing: the image was missing the code, and the code was also wrong. Can you try pulling the image and the repo and running again?
I pulled the images and the repo and ran it again:
...
algo-1-i2cxi_1 | Uploaded hyperparameters.json to S3
algo-1-i2cxi_1 | Uploaded IP address information to S3: 172.18.0.4
algo-1-i2cxi_1 | ## Creating graph - name: BasicRLGraphManager
algo-1-i2cxi_1 | Loaded action space from file: [{'steering_angle': -25, 'speed': 3.0, 'index': 0}, {'steering_angle': -25, 'speed': 6, 'index': 1}, {'steering_angle': -12.5, 'speed': 3, 'index': 2}, {'steering_angle': -12.5, 'speed': 6, 'index': 3}, {'steering_angle': 0, 'speed': 3, 'index': 4}, {'steering_angle': 0, 'speed': 6, 'index': 5}, {'steering_angle': 12.5, 'speed': 3, 'index': 6}, {'steering_angle': 12.5, 'speed': 6, 'index': 7}, {'steering_angle': 25, 'speed': 3, 'index': 8}, {'steering_angle': 25, 'speed': 6, 'index': 9}]
algo-1-i2cxi_1 | ## Creating agent - name: agent
algo-1-i2cxi_1 | Checkpoint> Saving in path=['./checkpoint/0_Step-0.ckpt']
algo-1-i2cxi_1 | Uploaded 3 files for checkpoint 0
algo-1-i2cxi_1 | INFO:tensorflow:Froze 11 variables.
algo-1-i2cxi_1 | INFO:tensorflow:Converted 11 variables to const ops.
algo-1-i2cxi_1 | saved intermediate frozen graph: rl-deepracer-sagemaker/model/model_0.pb
algo-1-i2cxi_1 | Training> Name=main_level/agent, Worker=0, Episode=1, Total reward=2079263.68, Steps=42, Training iteration=0
...
[ INFO] [1563508383.752479434, 0.380000000]: Physics dynamic reconfigure ready.
[ INFO] [1563508383.801276497, 0.422000000]: Physics dynamic reconfigure ready.
[INFO] [1563508383.934626, 0.552000]: Loading controller: right_rear_wheel_velocity_controller
/usr/local/lib/python3.5/dist-packages/gym/envs/registration.py:14: PkgResourcesDeprecationWarning: Parameters to load are deprecated. Call .resolve and .require separately.
  result = entry_point.load(False)
[INFO] [1563508384.388128, 1.003000]: Loading controller: left_front_wheel_velocity_controller
[INFO] [1563508384.709887, 1.323000]: Loading controller: right_front_wheel_velocity_controller
[INFO] [1563508384.810976, 1.409000]: Loading controller: left_steering_hinge_position_controller
Loaded action space from file: [{'speed': 3.0, 'index': 0, 'steering_angle': -25}, {'speed': 6, 'index': 1, 'steering_angle': -25}, {'speed': 3, 'index': 2, 'steering_angle': -12.5}, {'speed': 6, 'index': 3, 'steering_angle': -12.5}, {'speed': 3, 'index': 4, 'steering_angle': 0}, {'speed': 6, 'index': 5, 'steering_angle': 0}, {'speed': 3, 'index': 6, 'steering_angle': 12.5}, {'speed': 6, 'index': 7, 'steering_angle': 12.5}, {'speed': 3, 'index': 8, 'steering_angle': 25}, {'speed': 6, 'index': 9, 'steering_angle': 25}]
[INFO] [1563508385.060141, 1.637000]: Loading controller: right_steering_hinge_position_controller
[INFO] [1563508385.232757, 1.811000]: Loading controller: joint_state_controller
[INFO] [1563508385.271250, 1.844000]: Controller Spawner: Loaded controllers: left_rear_wheel_velocity_controller, right_rear_wheel_velocity_controller, left_front_wheel_velocity_controller, right_front_wheel_velocity_controller, left_steering_hinge_position_controller, right_steering_hinge_position_controller, joint_state_controller
[INFO] [1563508385.280768, 1.856000]: Started controllers: left_rear_wheel_velocity_controller, right_rear_wheel_velocity_controller, left_front_wheel_velocity_controller, right_front_wheel_velocity_controller, left_steering_hinge_position_controller, right_steering_hinge_position_controller, joint_state_controller
SIM_TRACE_LOG:0,0,4.4374,0.5318,-0.0564,0.00,0.00,0,0.0100,False,True,0.6399,1,21.88,1563508385.4749124
SIM_TRACE_LOG:0,1,4.4374,0.5318,-0.0564,-0.44,3.00,0,86799.3431,False,True,0.6400,1,21.88,1563508385.5607224
2019-07-19 03:53:14.005197: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
SIM_TRACE_LOG:0,0,4.4374,0.5318,-0.0565,0.00,0.00,0,0.0100,False,True,0.6401,1,21.88,1563508398.1386523
SIM_TRACE_LOG:0,1,4.4382,0.5318,-0.0568,0.44,6.00,9,93798.2608,False,True,0.6437,1,21.88,1563508398.2581823
SIM_TRACE_LOG:0,2,4.4464,0.5314,-0.0559,-0.22,3.00,2,86822.5664,False,True,0.6815,1,21.88,1563508398.2943811
SIM_TRACE_LOG:0,3,4.4588,0.5310,-0.0536,-0.22,6.00,3,92995.6198,False,True,0.7377,1,21.88,1563508398.3413289
SIM_TRACE_LOG:0,4,4.4887,0.5297,-0.0529,0.00,3.00,4,85067.8961,False,True,0.8747,1,21.88,1563508398.4227524
SIM_TRACE_LOG:0,5,4.5367,0.5277,-0.0491,-0.44,3.00,0,82521.0044,False,True,1.0938,2,21.88,1563508398.5069866
SIM_TRACE_LOG:0,6,4.6109,0.5247,-0.0456,-0.44,3.00,0,74820.1111,False,True,1.4327,2,21.88,1563508398.5985994
SIM_TRACE_LOG:0,7,4.6710,0.5199,-0.0564,0.22,3.00,6,81336.4011,False,True,1.7079,3,21.88,1563508398.6519997
SIM_TRACE_LOG:0,8,4.7155,0.5159,-0.0654,0.22,3.00,6,88473.2405,False,True,1.9124,3,21.88,1563508398.6983905
SIM_TRACE_LOG:0,9,4.7783,0.5102,-0.0743,0.44,6.00,9,99996.1063,False,True,2.2007,4,21.88,1563508398.7886
SIM_TRACE_LOG:0,10,4.8622,0.5047,-0.0694,0.44,6.00,9,99680.9729,False,True,2.5852,4,21.88,1563508398.871944
SIM_TRACE_LOG:0,11,4.9583,0.5024,-0.0456,-0.22,6.00,3,89752.4556,False,True,3.0240,5,21.88,1563508398.9617953
SIM_TRACE_LOG:0,12,5.0775,0.5016,-0.0230,0.44,6.00,9,78668.7662,False,True,3.5676,6,21.88,1563508399.0140648
SIM_TRACE_LOG:0,13,5.1657,0.5016,-0.0116,0.00,6.00,5,73132.6373,False,True,3.9698,7,21.88,1563508399.0872471
SIM_TRACE_LOG:0,14,5.3063,0.5013,-0.0043,-0.44,3.00,0,64695.0845,False,True,4.6116,8,21.88,1563508399.169118
SIM_TRACE_LOG:0,15,5.4117,0.5007,-0.0065,0.22,6.00,7,77793.7254,False,True,5.0942,9,21.88,1563508399.2376678
SIM_TRACE_LOG:0,16,5.5115,0.4994,-0.0104,0.00,3.00,4,75708.7385,False,True,5.5510,9,21.88,1563508399.2830899
SIM_TRACE_LOG:0,17,5.6022,0.4977,-0.0164,0.44,6.00,9,89309.6306,False,True,5.9671,10,21.88,1563508399.361125
SIM_TRACE_LOG:0,18,5.7174,0.5006,0.0132,0.00,3.00,4,71259.8754,False,True,6.4950,11,21.88,1563508399.425759
SIM_TRACE_LOG:0,19,5.8013,0.5045,0.0295,0.44,3.00,8,61567.6215,False,True,6.8776,12,21.88,1563508399.4955459
SIM_TRACE_LOG:0,20,5.8789,0.5113,0.0551,0.22,6.00,7,62466.2224,False,True,7.2359,12,21.88,1563508399.5506108
SIM_TRACE_LOG:0,21,5.9680,0.5227,0.0922,-0.44,6.00,1,40004.0617,False,True,7.6412,13,21.88,1563508399.6270547
SIM_TRACE_LOG:0,22,6.0935,0.5352,0.0910,0.44,6.00,9,51297.9247,False,True,8.2225,14,21.88,1563508399.689689
SIM_TRACE_LOG:0,23,6.1945,0.5468,0.1005,0.44,3.00,8,53794.9543,False,True,8.6983,15,21.88,1563508399.7525225
SIM_TRACE_LOG:0,24,6.3125,0.5668,0.1414,0.22,6.00,7,59909.9705,False,True,9.2631,16,21.88,1563508399.821851
SIM_TRACE_LOG:0,25,6.3959,0.5828,0.1686,-0.22,3.00,2,69620.6658,False,True,9.6856,16,21.88,1563508399.876625
SIM_TRACE_LOG:0,26,6.4934,0.6021,0.1831,0.22,6.00,7,104111.5641,False,True,10.1943,17,21.88,1563508399.9597166
SIM_TRACE_LOG:0,27,6.5916,0.6266,0.2158,0.44,3.00,8,54892.8298,False,True,10.7300,18,21.88,1563508400.0426738
SIM_TRACE_LOG:0,28,6.7149,0.6689,0.2911,-0.44,3.00,0,28063.5181,False,True,11.4113,19,21.88,1563508400.1164913
SIM_TRACE_LOG:0,29,6.7732,0.6895,0.3093,0.22,6.00,7,46293.1490,False,True,11.6931,20,21.88,1563508400.17127
SIM_TRACE_LOG:0,30,6.8523,0.7209,0.3424,0.22,3.00,6,1794.7149,False,True,12.1648,21,21.88,1563508400.2437928
SIM_TRACE_LOG:0,31,6.9688,0.7729,0.3923,-0.44,3.00,0,1819.3461,False,True,12.8089,22,21.88,1563508400.3630335
SIM_TRACE_LOG:0,32,7.0275,0.7973,0.3911,0.00,3.00,4,2218.6975,False,True,13.1382,22,21.88,1563508400.4160743
SIM_TRACE_LOG:0,33,7.0724,0.8145,0.3821,-0.22,3.00,2,2383.0646,False,True,13.3361,23,21.88,1563508400.4683928
SIM_TRACE_LOG:0,34,7.1364,0.8379,0.3697,-0.44,6.00,1,7797.1512,False,True,13.6318,23,21.88,1563508400.5494704
SIM_TRACE_LOG:0,35,7.2154,0.8621,0.3332,0.00,3.00,4,2665.7023,False,True,13.9307,24,21.88,1563508400.6037889
SIM_TRACE_LOG:0,36,7.2824,0.8802,0.3048,0.00,6.00,5,8322.9197,False,True,14.1280,24,21.88,1563508400.6736581
SIM_TRACE_LOG:0,37,7.3563,0.8989,0.2808,0.44,6.00,9,7067.6916,False,True,14.3563,24,21.88,1563508400.7448854
SIM_TRACE_LOG:0,38,7.4474,0.9234,0.2779,-0.44,3.00,0,1137.0448,False,True,14.6435,25,21.88,1563508400.8020718
SIM_TRACE_LOG:0,39,7.5363,0.9471,0.2696,0.00,6.00,5,0.0100,False,False,14.7276,25,21.88,1563508400.8817086
SIM_TRACE_LOG:0,40,7.6273,0.9727,0.2718,0.22,6.00,7,0.0100,False,False,14.9644,25,21.88,1563508400.9617043
SIM_TRACE_LOG:0,41,7.7508,1.0111,0.2930,0.22,3.00,6,0.0100,False,False,15.2947,26,21.88,1563508401.0562809
SIM_TRACE_LOG:0,42,7.8397,1.0421,0.3133,0.22,3.00,6,0.0000,True,False,15.2947,26,21.88,1563508401.1350539
reward: 2173061.9472040334
Training> Name=main_level/agent, Worker=0, Episode=1, Total reward=2173061.94, Steps=42, Training iteration=0
I got one step further, but then it is stuck for a long time with nothing happening, and there is no error.
Okay so it looks like it is working as intended. Can you post your robomaker.env file? Seems like your reward function is triggering an early exit due to having such a high value
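To see how large the rewards in a log like the one above are, a minimal sketch along these lines can sum the reward column of the SIM_TRACE_LOG lines per episode. The column order in `FIELDS` is an assumption (it is not stated in this thread), and the helper names `parse_trace_line` and `reward_per_episode` are made up for illustration.

```python
# Assumed column order of a SIM_TRACE_LOG record; not confirmed in this thread.
FIELDS = [
    "episode", "step", "x", "y", "yaw", "steering_angle", "speed",
    "action", "reward", "done", "on_track", "progress",
    "closest_waypoint", "track_length", "timestamp",
]
PREFIX = "SIM_TRACE_LOG:"


def parse_trace_line(line):
    """Return a dict for one SIM_TRACE_LOG line, or None for any other line."""
    start = line.find(PREFIX)
    if start == -1:
        return None
    values = line[start + len(PREFIX):].strip().split(",")
    if len(values) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, values))
    # Only the fields needed for the summary are converted from strings.
    record["episode"] = int(record["episode"])
    record["step"] = int(record["step"])
    record["reward"] = float(record["reward"])
    return record


def reward_per_episode(lines):
    """Sum the (assumed) reward column for each episode number."""
    totals = {}
    for line in lines:
        record = parse_trace_line(line)
        if record is not None:
            episode = record["episode"]
            totals[episode] = totals.get(episode, 0.0) + record["reward"]
    return totals


if __name__ == "__main__":
    # Two lines copied from the log in this thread.
    sample = [
        "SIM_TRACE_LOG:0,1,4.4382,0.5318,-0.0568,0.44,6.00,9,93798.2608,False,True,0.6437,1,21.88,1563508398.2581823",
        "SIM_TRACE_LOG:0,2,4.4464,0.5314,-0.0559,-0.22,3.00,2,86822.5664,False,True,0.6815,1,21.88,1563508398.2943811",
    ]
    for episode, total in sorted(reward_per_episode(sample).items()):
        print("episode", episode, "total reward", total)
```

Feeding the whole terminal log through `reward_per_episode` makes it easy to spot per-step rewards in the tens of thousands, which is the kind of value the comment above points at.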
@crr0004 you are amazing, car is up and running again. Thanks a lot. When I downloaded the new repo I replaced the file and lost the changes I had made before.
In case it helps someone else, this was also happening to me because the "custom_files" folder in the minio bucket representation on the local filesystem had incorrect ownership.
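A quick way to check for this kind of ownership problem is sketched below. It assumes the folder should be owned by the user running the script, which may not match your setup, and it only works on Unix-like systems (`os.getuid`). The function name `find_foreign_owned` and the example path are hypothetical.

```python
import os


def find_foreign_owned(root):
    """Return paths at or under root that are not owned by the current user."""
    uid = os.getuid()
    foreign = []
    if os.lstat(root).st_uid != uid:
        foreign.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            # lstat so that symlinks are checked themselves, not their targets.
            if os.lstat(full).st_uid != uid:
                foreign.append(full)
    return foreign


if __name__ == "__main__":
    # Hypothetical location; replace with your local minio bucket folder.
    folder = os.path.expanduser("~/minio/bucket/custom_files")
    if os.path.isdir(folder):
        for path in find_foreign_owned(folder):
            print("not owned by current user:", path)
    else:
        print("folder not found:", folder)
```

Any path it prints is a candidate for a `chown` to the expected user.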
I updated both docker images today to be able to use the new world. Now I have the following problem: After 1 training period SageMaker is at: saved intermediate frozen graph: rl-deepracer-sagemaker/model/model_0.pb
Robomaker is stuck at: reward: 123456
for several minutes and then shows this error: Could not connect to the endpoint URL: "https://robomaker.us-east1.amazonaws.com/cancelSimulationJob"
With the old track everything was working