firepro20 closed this issue 5 years ago
Hi @firepro20, indeed CollectObservations is called before OnTriggerXXX, as laid out in the MonoBehaviour event execution order; CollectObservations is called from FixedUpdate in that diagram.
It seems like you are using a trigger to let your Agent know when it is intersecting an object's collider on the first frame (and maybe on subsequent frames). You could instead let your Agent hold a reference to this first object so it can initialize itself in Awake. This way, when CollectObservations is called on the first frame, your Agent will already be properly initialized, and you can then use your OnTriggerEnter function to update it as you already are.
Does this make sense?
This has worked. I am initializing in Awake:
private void Awake()
{
    spikesList = new List<GameObject>();
    GameObject spikeOne = new GameObject();
    GameObject spikeTwo = new GameObject();
    // Initialisation before overwriting on each subsequent frame after the first
    spikesList.Insert(0, spikeOne);
    spikesList.Insert(1, spikeTwo);
}
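A minimal sketch of the trigger-side update that overwrites those placeholders in place, so the list keeps its fixed size of 2 (the "Spike" tag and the nextSpikeIndex field are assumptions for illustration, not taken from the actual project):

private int nextSpikeIndex = 0; // hypothetical: index of the next placeholder to overwrite

private void OnTriggerEnter(Collider other)
{
    // Assumption: spike objects carry the tag "Spike" in the scene.
    if (other.CompareTag("Spike") && !spikesList.Contains(other.gameObject))
    {
        // Overwrite a placeholder created in Awake rather than growing the list.
        spikesList[nextSpikeIndex] = other.gameObject;
        nextSpikeIndex = (nextSpikeIndex + 1) % spikesList.Count;
    }
}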
Now I have a new problem; I'm not sure if it's related. I need some help understanding the output: there are no warnings or errors in Unity, but training ends about 2 seconds in with the following message in the Anaconda console -
INFO:mlagents.envs:Hyperparameters for the PPOTrainer of brain RLBrain:
trainer: ppo
batch_size: 4096
beta: 0.005
buffer_size: 40960
epsilon: 0.2
hidden_units: 256
lambd: 0.95
learning_rate: 0.0001
learning_rate_schedule: linear
max_steps: 5.0e6
memory_size: 512
normalize: False
num_epoch: 8
num_layers: 2
time_horizon: 1024
sequence_length: 64
summary_freq: 1000
use_recurrent: False
vis_encode_type: simple
reward_signals:
    extrinsic:
        strength: 1.0
        gamma: 0.99
summary_path: ./summaries/RLAgent-3_RLBrain
model_path: ./models/RLAgent-3-0/RLBrain
keep_checkpoints: 5
Process Process-1:
Traceback (most recent call last):
File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\site-packages\mlagents\envs\subprocess_env_manager.py", line 116, in worker
cmd.payload[0], cmd.payload[1], cmd.payload[2]
File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\site-packages\mlagents\envs\environment.py", line 352, in reset
self._n_agents[_b] = len(s[_b].agents)
KeyError: 'RLBrain'
INFO:mlagents.envs:Learning was interrupted. Please wait while the graph is generated.
INFO:mlagents.envs:Saved Model
INFO:mlagents.trainers:List of nodes to export for brain :RLBrain
INFO:mlagents.trainers: is_continuous_control
INFO:mlagents.trainers: version_number
INFO:mlagents.trainers: memory_size
INFO:mlagents.trainers: action_output_shape
INFO:mlagents.trainers: action
INFO:mlagents.trainers: action_probs
INFO:tensorflow:Froze 11 variables.
INFO:tensorflow:Froze 11 variables.
Converted 11 variables to const ops.
Converting ./models/RLAgent-3-0/RLBrain/frozen_graph_def.pb to ./models/RLAgent-3-0/RLBrain.nn
IGNORED: StopGradient unknown layer
GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 14] => 'main_graph_0/hidden_0/BiasAdd'
IN: 'epsilon': [-1, 1, 1, 2] => 'mul'
OUT: 'action', 'action_probs'
DONE: wrote ./models/RLAgent-3-0/RLBrain.nn file.
INFO:mlagents.trainers:Exported ./models/RLAgent-3-0/RLBrain.nn file
Not sure if this is the appropriate place to post the above, but it seems to be related to the previous error.
I think I know what the issue is: I am reloading the whole level when my agent dies. I will try to avoid this by marking the agent as done when it has no health, so the simulation starts over without restarting/loading the whole level.
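A minimal sketch of that approach, assuming the ~0.9-era ML-Agents C# API (Done / AgentReset) and a hypothetical health field; the start-state handling is illustrative only:

using MLAgents;
using UnityEngine;

public class RLAgent : Agent
{
    public float health = 100f;        // assumption: health is tracked on the agent
    private Vector3 startPosition;

    private void Awake()
    {
        startPosition = transform.position;
    }

    private void Update()
    {
        if (health <= 0f)
        {
            // End the episode instead of reloading the level, so the Python
            // trainer keeps its connection to the running environment.
            Done();
        }
    }

    public override void AgentReset()
    {
        // Put the agent back into its starting state for the next episode.
        transform.position = startPosition;
        health = 100f;
    }
}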
Just an FYI, using new to create GameObjects isn't technically supported. Although it works as a workaround for you, I'd recommend either pulling a GameObject from the scene or instantiating a prefab instead. You can read about it here:
https://docs.unity3d.com/Manual/CreateDestroyObjects.html
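For example, something along these lines (the field and class names are hypothetical; assign the references in the Inspector):

using System.Collections.Generic;
using UnityEngine;

public class SpikeSetup : MonoBehaviour
{
    // Assumption: assigned in the Inspector, either scene objects or a prefab asset.
    [SerializeField] private GameObject spikeOneInScene;
    [SerializeField] private GameObject spikeTwoInScene;
    [SerializeField] private GameObject spikePrefab;

    private List<GameObject> spikesList;

    private void Awake()
    {
        // Option A: reference spikes that already exist in the scene.
        spikesList = new List<GameObject> { spikeOneInScene, spikeTwoInScene };

        // Option B: instantiate copies of a prefab instead.
        // spikesList = new List<GameObject> { Instantiate(spikePrefab), Instantiate(spikePrefab) };
    }
}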
I acknowledge this is not technically supported; however, it temporarily solves my problem by filling the observation vector with placeholder data at the very start. Feel free to close this issue, thanks for the help!
Closing per your last comment. Thanks for posting.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I have an issue only when starting my game environment. The Vector Observation size has 6 fewer observations on start (first frame). [I am assuming CollectObservations is being called before OnTriggerEnter, where I populate the list that feeds observations to the former method.]
On each subsequent frame update, the size is correct and I am sending 14 observations. Until I fix this issue, training is not possible, as an error is thrown in Anaconda -
Traceback (most recent call last):
  File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\site-packages\mlagents\envs\subprocess_env_manager.py", line 116, in worker
    cmd.payload[0], cmd.payload[1], cmd.payload[2]
  File "c:\users\dzamm\anaconda3\envs\ml-agents\lib\site-packages\mlagents\envs\environment.py", line 352, in reset
    self._n_agents[_b] = len(s[_b].agents)
KeyError: 'RLBrain'
This is how I collect observations -
This is how I add to the Spikes list [ensuring that the list has a fixed size of 2] -
The question is, how can I ensure that I also get 14 vector observations at the start, instead of just 8, since the list is populated only after CollectObservations has been called?
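Besides initializing placeholders in Awake, one way to keep the observation vector a fixed size regardless of trigger timing is to pad inside CollectObservations itself. A minimal sketch, assuming the ~0.9-era Agent API (AddVectorObs) and that each spike contributes a position (3 floats); the actual 14-observation layout from the screenshots is not reproduced here:

public override void CollectObservations()
{
    foreach (GameObject spike in spikesList)
    {
        if (spike != null)
        {
            AddVectorObs(spike.transform.position);   // 3 floats per spike
        }
        else
        {
            // Pad with zeros on frames where OnTriggerEnter hasn't populated
            // the list yet, so the vector size never changes.
            AddVectorObs(Vector3.zero);
        }
    }
    // ... remaining observations (agent position, velocity, etc.) ...
}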