Closed: DanTulovsky closed this issue 2 years ago.
Hi @DanTulovsky - checking on this for you. It may have to do with the fact that you have two different agent types in self-play.
Hi @DanTulovsky
I tried to reproduce this with our asymmetric self-play environment using hyperparameters similar to yours, but it seemed to work just fine. My first thought was that maybe something would break if team_change was less than the buffer size. I'd advise using team_change/save_steps values that are greater than the buffer size. Saving a policy every 500 steps when the policy only gets updated every 20000 steps means you'll save 40 instances of the exact same policy! Unless there's a good reason for this, I'd recommend increasing this to at least the buffer size, if not greater, so that you only save policies once they've been updated. Additionally, if team_change is less than the buffer size, we don't actually make any updates to the policy, and so swapping the team means training against the same opponent. I'd advise setting team_change to be some multiple of save_steps.
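To make this concrete, the self_play section might look something like the sketch below. These are illustrative numbers only (assuming a buffer_size of 20480 purely for the sake of the example), not tuned values for your game:

buffer_size: 20480
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 20480       # at least buffer_size, so every saved snapshot is an updated policy
    swap_steps: 20480
    team_change: 102400     # a multiple of save_steps (5x here)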
That being said, I'm not sure it's your problem. I notice this warning in your trace, which may be a symptom of the issue:
2020-05-21 16:12:34 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
I'm not sure what the precise cause of this could be without more implementation details. Are both agents present in the scene upon initialization?
Thanks for looking into this. I made the numbers smaller so I can trigger the failure faster. I've just re-run it with a buffer_size of 100 and batch_size of 10, and the same problem comes up (see paste below).
I have a Prefab for the entire training area, which is not in the scene when it starts up. The prefab (which includes the agents and everything else) is instantiated in the Start() method of GameManager. I call Academy.Instance.EnvironmentStep(); at the very end of this method.
If it helps, here's the package itself.
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep13
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
▄▄▄▓▓▓▓
╓▓▓▓▓▓▓█▓▓▓▓▓
,▄▄▄m▀▀▀' ,▓▓▓▀▓▓▄ ▓▓▓ ▓▓▌
▄▓▓▓▀' ▄▓▓▀ ▓▓▓ ▄▄ ▄▄ ,▄▄ ▄▄▄▄ ,▄▄ ▄▓▓▌▄ ▄▄▄ ,▄▄
▄▓▓▓▀ ▄▓▓▀ ▐▓▓▌ ▓▓▌ ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌ ╒▓▓▌
▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓ ▓▀ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▄ ▓▓▌
▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄ ▓▓ ▓▓▌ ▐▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▌ ▐▓▓▐▓▓
^█▓▓▓ ▀▓▓▄ ▐▓▓▌ ▓▓▓▓▄▓▓▓▓ ▐▓▓ ▓▓▓ ▓▓▓ ▓▓▓▄ ▓▓▓▓`
'▀▓▓▓▄ ^▓▓▓ ▓▓▓ └▀▀▀▀ ▀▀ ^▀▀ `▀▀ `▀▀ '▀▀ ▐▓▓▌
▀▀▀▀▓▄▄▄ ▓▓▓▓▓▓, ▓▓▓▓▀
`▀█▓▓▓▓▓▓▓▓▓▌
¬`▀▀▀█▓
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-23 14:38:57 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-23 14:39:12 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-23 14:39:13 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-23 14:39:13.280284: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-23 14:39:13.299053: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fbbed249720 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-23 14:39:13.299085: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-23 14:39:13 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep13_Sheep:
summary_path: WolfSheep13_Sheep
model_path: ./models/WolfSheep13/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-23 14:39:16 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-23 14:39:16 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-23 14:39:16 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep13_Wolf:
summary_path: WolfSheep13_Wolf
model_path: ./models/WolfSheep13/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-23 14:39:31 INFO [stats.py:111] WolfSheep13_Sheep: Step: 1000. Time Elapsed: 34.223 s Mean Reward: 0.394. Std of Reward: 0.876. Training.
2020-05-23 14:39:31 INFO [stats.py:116] WolfSheep13_Sheep ELO: 1204.302.
2020-05-23 14:40:31 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-23 14:40:31 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-23 14:40:31 INFO [model_serialization.py:223] is_continuous_control
2020-05-23 14:40:31 INFO [model_serialization.py:223] version_number
2020-05-23 14:40:31 INFO [model_serialization.py:223] memory_size
2020-05-23 14:40:31 INFO [model_serialization.py:223] action_output_shape
2020-05-23 14:40:31 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep13/Sheep/frozen_graph_def.pb; No such file or directory
dant@imacz :( [14:40:32] [~/Unity Local/ML]
Can you try making the team_change larger? Say, 10000?
Same result (see below).
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep14
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-24 22:48:12 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-24 22:48:50 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-24 22:48:51 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-24 22:48:51.351221: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-24 22:48:51.368726: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8d19b98180 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-24 22:48:51.368746: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-24 22:48:51 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep14_Sheep:
summary_path: WolfSheep14_Sheep
model_path: ./models/WolfSheep14/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 10000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-24 22:48:54 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-24 22:48:54 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-24 22:48:54 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep14_Wolf:
summary_path: WolfSheep14_Wolf
model_path: ./models/WolfSheep14/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 10000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-24 22:49:08 INFO [stats.py:111] WolfSheep14_Sheep: Step: 1000. Time Elapsed: 55.477 s Mean Reward: 0.463. Std of Reward: 0.886. Training.
2020-05-24 22:49:08 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1204.079.
2020-05-24 22:49:19 INFO [stats.py:111] WolfSheep14_Sheep: Step: 2000. Time Elapsed: 66.273 s Mean Reward: 0.500. Std of Reward: 0.866. Training.
2020-05-24 22:49:19 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1214.286.
2020-05-24 22:49:27 INFO [stats.py:111] WolfSheep14_Sheep: Step: 3000. Time Elapsed: 74.558 s Mean Reward: 0.750. Std of Reward: 0.661. Training.
2020-05-24 22:49:27 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1225.421.
2020-05-24 22:49:35 INFO [stats.py:111] WolfSheep14_Sheep: Step: 4000. Time Elapsed: 83.123 s Mean Reward: 0.561. Std of Reward: 0.828. Training.
2020-05-24 22:49:35 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1236.238.
2020-05-24 22:49:44 INFO [stats.py:111] WolfSheep14_Sheep: Step: 5000. Time Elapsed: 91.305 s Mean Reward: 0.659. Std of Reward: 0.753. Training.
2020-05-24 22:49:44 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1243.298.
2020-05-24 22:49:52 INFO [stats.py:111] WolfSheep14_Sheep: Step: 6000. Time Elapsed: 99.732 s Mean Reward: 0.600. Std of Reward: 0.800. Training.
2020-05-24 22:49:52 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1251.554.
2020-05-24 22:50:03 INFO [stats.py:111] WolfSheep14_Sheep: Step: 7000. Time Elapsed: 110.498 s Mean Reward: 0.476. Std of Reward: 0.879. Training.
2020-05-24 22:50:03 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1258.928.
2020-05-24 22:50:11 INFO [stats.py:111] WolfSheep14_Sheep: Step: 8000. Time Elapsed: 118.667 s Mean Reward: 0.300. Std of Reward: 0.954. Training.
2020-05-24 22:50:11 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1262.214.
2020-05-24 22:50:20 INFO [stats.py:111] WolfSheep14_Sheep: Step: 9000. Time Elapsed: 127.409 s Mean Reward: 0.302. Std of Reward: 0.953. Training.
2020-05-24 22:50:20 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1263.285.
2020-05-24 22:50:28 INFO [stats.py:111] WolfSheep14_Sheep: Step: 10000. Time Elapsed: 135.982 s Mean Reward: 0.659. Std of Reward: 0.753. Training.
2020-05-24 22:50:28 INFO [stats.py:116] WolfSheep14_Sheep ELO: 1266.750.
2020-05-24 22:51:28 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-24 22:51:28 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-24 22:51:28 INFO [model_serialization.py:223] is_continuous_control
2020-05-24 22:51:28 INFO [model_serialization.py:223] version_number
2020-05-24 22:51:28 INFO [model_serialization.py:223] memory_size
2020-05-24 22:51:28 INFO [model_serialization.py:223] action_output_shape
2020-05-24 22:51:28 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep14/Sheep/frozen_graph_def.pb; No such file or directory
dant@imacz :( [22:51:29] [~/Unity Local/ML]
For debugging purposes, would it be possible to create a variant of this game where the prefab is present in the scene when you begin training? My intuition is that it's a peculiarity in the way the game is built that we didn't anticipate.
Alternatively, to determine if it's actually self_play, you can try removing the self_play hyperparameters; each agent will then just train individually without coordination between teams.
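For reference, that just means deleting the self_play block from each behavior's entry in the config (assuming your file is keyed by the behavior names Sheep and Wolf); a rough sketch:

Sheep:
    trainer: ppo
    batch_size: 10
    buffer_size: 100
    # ...other hyperparameters unchanged...
    # no self_play: section
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99
Wolf:
    # same as above, also without self_play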
It definitely works if I remove self_play entirely. But it seems self_play is a desirable feature, so I'd like to get it working.
I put the prefab directly into the scene, but it didn't make any difference.
% mlagents-learn configs/sheep_wolf_config2.yaml --run-id=WolfSheep16
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.2.0
2020-05-26 21:59:35 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-26 21:59:40 INFO [environment.py:111] Connected to Unity environment with package version 1.0.2-preview and communication version 1.0.0
2020-05-26 21:59:41 INFO [environment.py:343] Connected new brain:
Sheep?team=0
2020-05-26 21:59:41.025429: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-26 21:59:41.043269: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc61f263d30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-26 21:59:41.043289: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-26 21:59:41 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep16_Sheep:
summary_path: WolfSheep16_Sheep
model_path: ./models/WolfSheep16/Sheep
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-26 21:59:44 INFO [environment.py:343] Connected new brain:
Wolf?team=1
2020-05-26 21:59:44 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
2020-05-26 21:59:44 INFO [stats.py:130] Hyperparameters for behavior name WolfSheep16_Wolf:
summary_path: WolfSheep16_Wolf
model_path: ./models/WolfSheep16/Wolf
keep_checkpoints: 5
trainer: ppo
batch_size: 10
beta: 0.005
buffer_size: 100
epsilon: 0.2
hidden_units: 128
lambd: 0.95
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 128
normalize: False
num_epoch: 3
num_layers: 2
self_play:
    window: 10
    play_against_latest_model_ratio: 0.5
    save_steps: 500
    swap_steps: 1000
    team_change: 1000
sequence_length: 128
summary_freq: 1000
time_horizon: 64
use_recurrent: False
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
2020-05-26 22:02:04 INFO [stats.py:111] WolfSheep16_Sheep: Step: 1000. Time Elapsed: 149.164 s Mean Reward: 0.381. Std of Reward: 0.925. Training.
2020-05-26 22:02:04 INFO [stats.py:116] WolfSheep16_Sheep ELO: 1203.347.
2020-05-26 22:03:04 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-26 22:03:04 INFO [model_serialization.py:221] List of nodes to export for brain :Sheep?team=0
2020-05-26 22:03:04 INFO [model_serialization.py:223] is_continuous_control
2020-05-26 22:03:04 INFO [model_serialization.py:223] version_number
2020-05-26 22:03:04 INFO [model_serialization.py:223] memory_size
2020-05-26 22:03:04 INFO [model_serialization.py:223] action_output_shape
2020-05-26 22:03:04 INFO [model_serialization.py:223] action
Traceback (most recent call last):
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 232, in start_learning
self.reset_env_if_ready(env_manager, global_step)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 300, in reset_env_if_ready
self.end_trainer_episodes(env, lessons_incremented)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 265, in end_trainer_episodes
self._reset_env(env)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 154, in _reset_env
env.reset(config=sampled_reset_param)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/env_manager.py", line 67, in reset
self.first_step_infos = self._reset_env(config)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 295, in _reset_env
ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/subprocess_env_manager.py", line 92, in recv
raise env_exception
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
The environment does not need user interaction to launch
The Agents are linked to the appropriate Brains
The environment and the Python interface have compatible versions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 11, in <module>
load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 554, in main
run_cli(parse_command_line())
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 550, in run_cli
run_training(run_seed, options)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/learn.py", line 407, in run_training
tc.start_learning(env_manager)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 260, in start_learning
self._export_graph()
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 126, in _export_graph
self.trainers[brain_name].export_model(name_behavior_id)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/ghost/trainer.py", line 315, in export_model
self.trainer.export_model(brain_name)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/trainers/trainer/trainer.py", line 134, in export_model
export_policy_model(settings, policy.graph, policy.sess)
File "/Users/dant/Unity Local/ML/ml-agents/ml-agents/mlagents/model_serialization.py", line 71, in export_policy_model
f.write(frozen_graph_def.SerializeToString())
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
self._prewrite_check()
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
compat.as_bytes(self.__name), compat.as_bytes(self.__mode))
tensorflow.python.framework.errors_impl.NotFoundError: ./models/WolfSheep16/Sheep/frozen_graph_def.pb; No such file or directory
Is there any other debugging information I can provide here to try and fix this issue?
I have the same issue when training 4 agents. The first three agents are fine, but on the fourth one I get this error.
This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Describe the bug
(This is reproducible every time.) This is a setup with turn-based learning. Two agents (Sheep and Wolf) are involved, each with their own Behavior. The space type is Discrete and I am manually controlling the stepping process. Please see the output of mlagents-learn below, which includes the yaml config values. Please note that the learning process runs perfectly fine if I remove the self_play options (while leaving everything else the same).
Specifically, perhaps the following error is related:
2020-05-21 16:23:57 WARNING [env_manager.py:109] Agent manager was not created for behavior id Wolf?team=1.
Note that the crash happens when the number of steps given by 'team_change' is reached. I assume it's trying to switch learning to the other team. The team that learns first is 'Sheep'.
Then it should switch to "Wolf", but I am guessing it can't because of the error above. There is a long pause after 1000 steps, and then the mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. error comes up (notice that the error handling itself also produces a stack trace).
To Reproduce
Steps to reproduce the behavior:
I can provide the .unitypackage file if that helps.
Console logs / stack traces
Environment (please complete the following information):