Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Training crashes mid-training with code warning. #5312

Closed graybob closed 3 years ago

graybob commented 3 years ago

Describe the bug

Training starts and then crashes after a few steps with the error below. ML-Agents 1.0.7.

2021-04-23 21:45:09 INFO [stats.py:145] Hyperparameters for behavior name beh1:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   1024
          buffer_size:  10240
          learning_rate:        0.0003
          beta: 0.005
          epsilon:      0.2
          lambd:        0.95
          num_epoch:    4
          learning_rate_schedule:       linear
        network_settings:
          normalize:    False
          hidden_units: 98
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
        reward_signals:
          extrinsic:
            gamma:      0.99
            strength:   1.0
          curiosity:
            gamma:      0.99
            strength:   0.02
            encoding_size:      256
            learning_rate:      0.0003
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      45000000
        time_horizon:   64
        summary_freq:   10000
        threaded:       True
        self_play:      None
        behavioral_cloning:     None
        framework:      pytorch
2021-04-23 21:48:06 INFO [stats.py:139] beh1. Step: 10000. Time Elapsed: 193.651 s. Mean Reward: 0.125. Std of Reward: 0.227. Training.
c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\torch\utils.py:267: UserWarning: This overload of nonzero is deprecated:
        nonzero()
Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at  ..\torch\csrc\utils\python_arg_parser.cpp:882.)
  res += [data[(partitions == i).nonzero().squeeze(1)]]
2021-04-23 21:51:09 INFO [stats.py:139] beh1. Step: 20000. Time Elapsed: 376.560 s. Mean Reward: 0.107. Std of Reward: 0.226. Training.
2021-04-23 21:53:45 INFO [subprocess_env_manager.py:184] UnityEnvironment worker 0: environment stopping.
2021-04-23 21:53:45 INFO [model_serialization.py:93] Converting to results\PPO01\beh1\beh1-24573.onnx
c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\torch\networks.py:352: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  torch.Tensor([self.network_body.memory_size]),
2021-04-23 21:53:45 INFO [model_serialization.py:105] Exported results\PPO01\beh1\beh1-24573.onnx
2021-04-23 21:53:45 INFO [torch_model_saver.py:116] Copied results\PPO01\beh1\beh1-24573.onnx to results\PPO01\beh1.onnx.
2021-04-23 21:53:45 INFO [trainer_controller.py:85] Saved Model
2021-04-23 21:54:45 INFO [environment.py:406] Environment timed out shutting down. Killing...
Traceback (most recent call last):
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\DELL\Anaconda3\envs\ml-agents_rel1\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\learn.py", line 280, in main
    run_cli(parse_command_line())
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\learn.py", line 276, in run_cli
    run_training(run_seed, options)
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\learn.py", line 153, in run_training
    tc.start_learning(env_manager)
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 176, in start_learning
    n_steps = self.advance(env_manager)
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\trainer_controller.py", line 234, in advance
    new_step_infos = env_manager.get_steps()
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\env_manager.py", line 112, in get_steps
    new_step_infos = self._step()
  File "c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 267, in _step
    raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
         The environment does not need user interaction to launch
         The Agents' Behavior Parameters > Behavior Type is set to "Default"
         The environment and the Python interface have compatible versions.
chriselion commented 3 years ago

There's not enough information to go on here. Can you try connecting with one of our example scenes (3DBall is the easiest)? Do you have at least one Agent added to an active GameObject, and does it have the Behavior Type set to "Default"? Are there any other messages in the editor console or player log?
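For reference, the smoke test suggested here amounts to launching the trainer against the bundled 3DBall scene and pressing Play in the Editor. A hypothetical invocation (the config file path varies between ML-Agents releases, so treat it as an assumption):

```shell
# Launch the trainer, then press Play in the Unity Editor when prompted.
# The config path below is the Release 3+ layout; older releases used a
# single config/trainer_config.yaml instead.
mlagents-learn config/ppo/3DBall.yaml --run-id=3DBall_test
```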

graybob commented 3 years ago

Hi,

I don't think it's an installation issue; the 3DBall example runs just fine.

2021-04-24 08:33:01 INFO [stats.py:139] 3DBall. Step: 50000. Time Elapsed: 227.826 s. Mean Reward: 1.331. Std of Reward: 0.829. Training.
2021-04-24 08:36:46 INFO [stats.py:139] 3DBall. Step: 100000. Time Elapsed: 452.419 s. Mean Reward: 2.272. Std of Reward: 1.512. Training.
2021-04-24 08:40:14 INFO [stats.py:139] 3DBall. Step: 150000. Time Elapsed: 660.560 s. Mean Reward: 4.112. Std of Reward: 3.658. Training.

Only training my own agent causes the issue, and only after more than 5000 successful training steps; I can clearly see in the GUI that training has started.

Yes, I have one agent with Behavior Type set to "Default", and yes, it has a Decision Requester.

No, there is no other information printed after the training crashes and exits, other than the warning quoted above/below.

c:\users\dell\anaconda3\envs\ml-agents_rel1\lib\site-packages\mlagents\trainers\torch\utils.py:267: UserWarning: This overload of nonzero is deprecated:
        nonzero()
Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at  ..\torch\csrc\utils\python_arg_parser.cpp:882.)
  res += [data[(partitions == i).nonzero().squeeze(1)]]
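(Side note: this warning by itself is harmless. It comes from PyTorch deprecating the zero-argument `Tensor.nonzero()` overload in favor of passing `as_tuple` explicitly; the quoted line just splits `data` rows by partition index. A minimal sketch of that grouping logic in NumPy, with `data`/`partitions` borrowed from the quoted line and the NumPy translation being my own:)

```python
import numpy as np

# Rows of `data` are grouped by their partition index, mirroring
# res += [data[(partitions == i).nonzero().squeeze(1)]] in utils.py.
# In torch, the warning goes away with .nonzero(as_tuple=False).
data = np.array([[1.0], [2.0], [3.0], [4.0]])
partitions = np.array([0, 1, 0, 1])

res = []
for i in range(2):
    idx = np.nonzero(partitions == i)[0]  # row indices in partition i
    res.append(data[idx])
```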

Let me know if I can provide any specific information from any log.

graybob commented 3 years ago

Hi,

Sorry, this is an issue on my side: the Python package was well ahead of the 0.16.1 version recommended for Release 1.0.7.
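(For anyone hitting the same mismatch: the kind of check described above can be sketched with only the standard library. The 0.16.x pairing for Release 1.0.7 is taken from this comment; the helper names are hypothetical.)

```python
from importlib.metadata import version, PackageNotFoundError

EXPECTED_MAJOR_MINOR = (0, 16)  # mlagents line recommended for Release 1.0.7

def parse_major_minor(ver: str) -> tuple:
    """Extract (major, minor) from a version string like '0.16.1'."""
    return tuple(int(p) for p in ver.split(".")[:2])

def is_compatible(pkg: str = "mlagents") -> bool:
    """Return True if the installed package matches the expected 0.16.x line."""
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        return False
    return parse_major_minor(installed) == EXPECTED_MAJOR_MINOR
```

If the check fails, pinning back with `pip install mlagents==0.16.1` would restore the recommended pairing.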

Please close this issue.

Thanks

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.