Closed: DanTulovsky closed this issue 2 years ago.
Hi @DanTulovsky - checking on this for you. It may have to do with the fact that you have two different agent types in self-play.
Hi @DanTulovsky
I tried to reproduce this with our asymmetric self-play environment using hyperparameters similar to yours, but it seemed to work just fine. My first thought was that maybe something would break if team_change was less than the buffer size. I'd advise using team_change/save_steps values that are greater than the buffer size. Saving a policy every 500 steps when the policy only gets updated every 20000 steps means you'll save 40 instances of the exact same policy! Unless there's a good reason for this, I'd recommend increasing this to at least the buffer size, if not greater, so that you only save policies once they've been updated. Additionally, if team_change is less than the buffer size, we don't actually make any updates to the policy, and so swapping the team means training against the same opponent. I'd advise setting team_change to be some multiple of save_steps.
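To make this concrete, the self_play section might look something like the sketch below. These are illustrative numbers only (assuming a buffer_size of 20480 purely for the sake of the example), not tuned values for your game:

buffer_size: 20480
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 20480       # at least buffer_size, so every saved snapshot is an updated policy
    swap_steps: 20480
    team_change: 102400     # a multiple of save_steps (5x here)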
That being said, I'm not sure it's your problem. I notice this warning in your trace, which may be a symptom of the issue:
2020-05-21 16:12:34 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
I'm not sure what the precise cause of this could be without more implementation details. Are both agents present in the scene upon initialization?
Thanks for looking into this. I made the numbers smaller so I can trigger the failure faster. I've just re-run it with a buffer_size of 100 and batch_size of 10, and the same problem comes up (see paste below).
I have a Prefab for the entire training area, which is not in the scene when it starts up. The prefab (which includes the agents and everything else) is instantiated in the Start() method of GameManager. I call Academy.Instance.EnvironmentStep(); at the very end of this method.
If it helps, here's the package itself.
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep13
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
▄▄▄▓▓▓▓
╓▓▓▓▓▓▓█▓▓▓▓▓
,▄▄▄m▀▀▀' ,▓▓▓▀▓▓▄ ▓▓▓ ▓▓▌
▄▓▓▓▀' ▄▓▓▀ ▓▓▓ ▄▄ ▄▄ ,▄▄ ▄▄▄▄ ,▄▄ ▄▓▓▌▄ ▄▄▄ ,▄▄
▄▓▓▓▀ ▄▓▓▀ ▐▓▓▌ ▓▓▌ ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌ ╒▓▓▌
▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓ ▓▀ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▄ ▓▓▌
▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄ ▓▓ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▐▓▓
^█▓▓▓ ▀▓▓▄ ▐▓▓▌ ▓▓▓▓▄▓▓▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▓▄ ▓▓▓▓`
'▀▓▓▓▄ ^▓▓▓ ▓▓▓ └▀▀▀▀ ▀▀ ^▀▀ `▀▀ `▀▀ '▀▀ ▐▓▓▌
▀▀▀▀▓▄▄▄ ▓▓▓▓▓▓, ▓▓▓▓▀
`▀█▓▓▓▓▓▓▓▓▓▌
¬`▀▀▀█▓
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-23 14:38:57 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-23 14:39:12 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-23 14:39:13 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-23 14:39:13.280284: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-23 14:39:13.299053: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fbbed249720 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-23 14:39:13.299085: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-23 14:39:13 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep13_Sheep:
summary_path: WolfSheep13_Sheep
model_path: ./models/WolfSheep13/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-23 14:39:16 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-23 14:39:16 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-23 14:39:16 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep13_Wolf:
summary_path: WolfSheep13_Wolf
model_path: ./models/WolfSheep13/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-23 14:39:31 INFO [stats.py:111] WolfSheep13_Sheep: Step: 1000. Time Elapsed: 34.223 s Mean Reward: 0.394. Std of Reward: 0.876. Training.
2020-05-23 14:39:31 INFO [stats.py:116] WolfSheep13_Sheep ELO: 1204.302.
2020-05-23 14:40:31 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-23 14:40:31 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-23 14:40:31 INFO [model_serialization.py:223] is_continuous_control
2020-05-23 14:40:31 INFO [model_serialization.py:223] version_number
2020-05-23 14:40:31 INFO [model_serialization.py:223] memory_size
2020-05-23 14:40:31 INFO [model_serialization.py:223] action_output_shape
2020-05-23 14:40:31 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep13/Sheep/frozen_graph_def.pb; No such file or directory
dant@imacz :( [14:40:32] [~/Unity Local/ML]
Can you try making the team_change larger? Say, 10000?
Same result (see below).
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep14
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-24 22:48:12 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-24 22:48:50 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-24 22:48:51 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-24 22:48:51.351221: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-24 22:48:51.368726: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8d19b98180 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-24 22:48:51.368746: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-24 22:48:51 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep14_Sheep:
summary_path: WolfSheep14_Sheep
model_path: ./models/WolfSheep14/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 10000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-24 22:48:54 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-24 22:48:54 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-24 22:48:54 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep14_Wolf:
summary_path: WolfSheep14_Wolf
model_path: ./models/WolfSheep14/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 10000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-24 22:49:08 INFO [stats.py:111] WolfSheep14_Sheep: Step: 1000. Time Elapsed: 55.477 s Mean Reward: 0.463. Std of Reward: 0.886. Training.
2020-05-24 22:49:08 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1204.079.
2020-05-24 22:49:19 INFO [stats.py:111] WolfSheep14_Sheep: Step: 2000. Time Elapsed: 66.273 s Mean Reward: 0.500. Std of Reward: 0.866. Training.
2020-05-24 22:49:19 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1214.286.
2020-05-24 22:49:27 INFO [stats.py:111] WolfSheep14_Sheep: Step: 3000. Time Elapsed: 74.558 s Mean Reward: 0.750. Std of Reward: 0.661. Training.
2020-05-24 22:49:27 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1225.421.
2020-05-24 22:49:35 INFO [stats.py:111] WolfSheep14_Sheep: Step: 4000. Time Elapsed: 83.123 s Mean Reward: 0.561. Std of Reward: 0.828. Training.
2020-05-24 22:49:35 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1236.238.
2020-05-24 22:49:44 INFO [stats.py:111] WolfSheep14_Sheep: Step: 5000. Time Elapsed: 91.305 s Mean Reward: 0.659. Std of Reward: 0.753. Training.
2020-05-24 22:49:44 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1243.298.
2020-05-24 22:49:52 INFO [stats.py:111] WolfSheep14_Sheep: Step: 6000. Time Elapsed: 99.732 s Mean Reward: 0.600. Std of Reward: 0.800. Training.
2020-05-24 22:49:52 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1251.554.
2020-05-24 22:50:03 INFO [stats.py:111] WolfSheep14_Sheep: Step: 7000. Time Elapsed: 110.498 s Mean Reward: 0.476. Std of Reward: 0.879. Training.
2020-05-24 22:50:03 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1258.928.
2020-05-24 22:50:11 INFO [stats.py:111] WolfSheep14_Sheep: Step: 8000. Time Elapsed: 118.667 s Mean Reward: 0.300. Std of Reward: 0.954. Training.
2020-05-24 22:50:11 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1262.214.
2020-05-24 22:50:20 INFO [stats.py:111] WolfSheep14_Sheep: Step: 9000. Time Elapsed: 127.409 s Mean Reward: 0.302. Std of Reward: 0.953. Training.
2020-05-24 22:50:20 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1263.285.
2020-05-24 22:50:28 INFO [stats.py:111] WolfSheep14_Sheep: Step: 10000. Time Elapsed: 135.982 s Mean Reward: 0.659. Std of Reward: 0.753. Training.
2020-05-24 22:50:28 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1266.750.
2020-05-24 22:51:28 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-24 22:51:28 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-24 22:51:28 INFO [model_serialization.py:223] is_continuous_control
2020-05-24 22:51:28 INFO [model_serialization.py:223] version_number
2020-05-24 22:51:28 INFO [model_serialization.py:223] memory_size
2020-05-24 22:51:28 INFO [model_serialization.py:223] action_output_shape
2020-05-24 22:51:28 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep14/Sheep/frozen_graph_def.pb; No such file or directory
dant@imacz :( [22:51:29] [~/Unity Local/ML]
For debugging purposes, would it be possible to create a variant of this game where the prefab is present in the scene when you begin training? My intuition is that it's a peculiarity in the way the game is built that we didn't anticipate.
Alternatively, to determine if it's actually self_play, you can try removing the self_play hyperparameters; each agent will then just train individually without coordination between teams.
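For reference, that just means deleting the self_play block from each behavior's entry in the config (assuming your file is keyed by the behavior names Sheep and Wolf); a rough sketch:

Sheep:
    trainer: ppo
    batch_size: 10
    buffer_size: 100
    # ...other hyperparameters unchanged...
    # no self_play: section
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
Wolf:
    # same as above, also without self_play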
It definitely works if I remove self_play entirely. But it seems self_play is a desirable feature, so I'd like to get it working.
I put the prefab directly into the scene, but it didn't make any difference.
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep16
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-26 21:59:35 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-26 21:59:40 INFO [environment.py:111] Connected to Unity environment with package version 1.0.2-preview and communication version 1.0.0
2020-05-26 21:59:41 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-26 21:59:41.025429: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-26 21:59:41.043269: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc61f263d30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-26 21:59:41.043289: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-26 21:59:41 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep16_Sheep:
summary_path: WolfSheep16_Sheep
model_path: ./models/WolfSheep16/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-26 21:59:44 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-26 21:59:44 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-26 21:59:44 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep16_Wolf:
summary_path: WolfSheep16_Wolf
model_path: ./models/WolfSheep16/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-26 22:02:04 INFO [stats.py:111] WolfSheep16_Sheep: Step: 1000. Time Elapsed: 149.164 s Mean Reward: 0.381. Std of Reward: 0.925. Training.
2020-05-26 22:02:04 INFO [stats.py:116] WolfSheep16_Sheep ELO: 1203.347.
2020-05-26 22:03:04 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-26 22:03:04 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-26 22:03:04 INFO [model_serialization.py:223] is_continuous_control
2020-05-26 22:03:04 INFO [model_serialization.py:223] version_number
2020-05-26 22:03:04 INFO [model_serialization.py:223] memory_size
2020-05-26 22:03:04 INFO [model_serialization.py:223] action_output_shape
2020-05-26 22:03:04 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep16/Sheep/frozen_graph_def.pb; No such file or directory
Is there any other debugging information I can provide here to try and fix this issue?
I have the same issue when training 4 agents. The first three agents are fine, but on the fourth one I get this error.
This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Describe the bug
(This is reproducible every time.) This is a setup with turn-based learning. Two agents (Sheep and Wolf) are involved, each with their own Behavior. The space type is Discrete and I am manually controlling the stepping process. Please see the output of mlagents-learn below, which includes the yaml config values. Please note that the learning process runs perfectly fine if I remove the self_play options (while leaving everything else the same).
Specifically, perhaps the following error is related:
2020-05-21 16:23:57 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
Note that the crash happens when the number of steps given by 'team_change' is reached. I assume it's trying to switch learning to the other team. The team that learns first is 'Sheep'.
Then it should switch to "Wolf", but I am guessing it can't because of the error above. There is a long pause after 1000 steps, and then the mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. error comes up (notice that the error handling itself also produces a stack trace).
To Reproduce
Steps to reproduce the behavior:
I can provide the .unitypackage file if that helps.
Console logs / stack traces
Environment (please complete the following information):