aws-deepracer-community / deepracer-core

A repository binding together everything needed for DeepRacer local.

Training stops due to "botocore.exceptions.ClientError: An error occurred (RequestTimeTooSkewed)" #81

Closed daj closed 3 years ago

daj commented 4 years ago

I've been running local training on a 15" mid-2014 MacBook Pro for about a week. The last two times I started retraining from a pretrained model, it stopped after making two checkpoints due to this error:

botocore.exceptions.ClientError: An error occurred (RequestTimeTooSkewed) when calling the PutObject operation: The difference between the request time and the server's time is too large.

Does anybody have tips on resolving this?

Full stack trace:

...
SIM_TRACE_LOG:24,38,7.4246,4.6959,1.9455,0.44,2.00,18,0.0000,True,False,10.5888,52,21.88,1574308353.1797318

reward: 101.4262420039573
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py", line 303, in <module>
    main()
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py", line 298, in main
    memory_backend_params = memory_backend_params
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/rollout_worker.py", line 169, in rollout_worker
    graph_manager.act(EnvironmentEpisodes(num_steps=act_steps))
  File "/usr/local/lib/python3.5/dist-packages/rl_coach/graph_managers/graph_manager.py", line 443, in act
    result = self.top_level_manager.step(None)
  File "/usr/local/lib/python3.5/dist-packages/rl_coach/level_manager.py", line 230, in step
    env_response = self.environment.step(action_info.action)
  File "/usr/local/lib/python3.5/dist-packages/rl_coach/environments/environment.py", line 299, in step
    self._take_action(action)
  File "/usr/local/lib/python3.5/dist-packages/rl_coach/environments/gym_environment.py", line 448, in _take_action
    self.state, self.reward, self.done, self.info = self.env.step(action)
  File "/usr/local/lib/python3.5/dist-packages/gym/wrappers/time_limit.py", line 31, in step
    observation, reward, done, info = self.env.step(action)
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py", line 566, in step
    return super().step([self.steering_angle, self.speed])
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py", line 271, in step
    self.infer_reward_state(self.steering_angle, self.speed)
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py", line 437, in infer_reward_state
    self.finish_episode(current_progress)
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py", line 471, in finish_episode
    self.write_metrics_to_s3()
  File "/app/robomaker-deepracer/simulation_ws/install/sagemaker_rl_agent/lib/python3.5/site-packages/markov/environments/deepracer_racetrack_env.py", line 505, in write_metrics_to_s3
    Body=bytes(metrics_body, encoding='utf-8')
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (RequestTimeTooSkewed) when calling the PutObject operation: The difference between the request time and the server's time is too large.
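
For context: S3 rejects a signed request with RequestTimeTooSkewed when the request timestamp is more than 15 minutes away from the server's clock, so the usual suspect is a drifting clock on the machine (or container) making the PutObject call. Below is a minimal sketch for measuring that drift, assuming you have copied the `Date` header from any HTTP response of the S3 endpoint the training stack talks to. The function names and the sample timestamps are hypothetical and are not part of deepracer-core or botocore.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# S3 tolerates at most 15 minutes between request time and server time.
MAX_SKEW_SECONDS = 15 * 60


def clock_skew_seconds(server_date_header, local_now=None):
    """Return local time minus server time, in seconds.

    server_date_header is an HTTP Date header value such as
    "Thu, 21 Nov 2019 03:52:33 GMT" (hypothetical sample value).
    """
    server_time = parsedate_to_datetime(server_date_header)
    if local_now is None:
        # Use the clock of the machine running this script.
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def is_too_skewed(skew_seconds, limit=MAX_SKEW_SECONDS):
    """True if the absolute skew exceeds the allowed limit."""
    return abs(skew_seconds) > limit


if __name__ == "__main__":
    # Hypothetical example: the local clock is 20 minutes ahead of the server.
    local = datetime(2019, 11, 21, 4, 12, 33, tzinfo=timezone.utc)
    skew = clock_skew_seconds("Thu, 21 Nov 2019 03:52:33 GMT", local)
    print(skew, is_too_skewed(skew))
```

Running this inside the container that raised the error (without passing `local_now`) would show whether that container's clock, rather than the host's, is the one that has drifted. If it has, resynchronising the clock or restarting the Docker VM is a plausible remedy, though the thread does not confirm the root cause.
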
breadcentric commented 3 years ago

Unfortunately, we didn't manage to respond at the time. If this recurs on the new stack, please reopen the issue.

We've since moved to https://github.com/aws-deepracer-community/deepracer-for-cloud for running DeepRacer in a local environment.

Closing