Nirav-Madhani closed this issue 3 years ago.
Hi @Nirav-Madhani
Can you verify that this does not happen on one of our example environments or have you verified that it does happen? Are you seeing NaNs in the tensorboards?
It does work on the example environments.
For my environment, I checked training_status.json, timers.json, and TensorBoard as well. No NaN anywhere.
BTW, here is my config file in case it helps identify the problem:
```yaml
behaviors:
  TRC:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 2560
      learning_rate: 0.003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 128
      num_layers: 2
      vis_encode_type: resnet
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.99
        strength: 0.02
        network_settings:
          hidden_units: 256
        learning_rate: 0.0003
      gail:
        strength: 0.8
        demo_path: Project\Assets\T-Demos\D1.demo
    keep_checkpoints: 5
    max_steps: 400000
    time_horizon: 64
    behavioral_cloning:
      demo_path: Project\Assets\T-Demos\D1.demo
      strength: 0.7
    summary_freq: 100
```
The command I used:

```
mlagents-learn config\ppo\TRC.yaml --run-id=TRC --resume --time-scale 5
```
I have 4 camera inputs and 1 int variable as observations.
Is the behavior type in the Behavior Parameters script set to Default? Does your agent have a Decision Requester?
> Is the behavior type in the Behavior Parameters script set to Default?
Yes.
> Does your agent have a Decision Requester?

No, but I am manually calling `RequestDecision()` when required.
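For context, manually requesting decisions (instead of attaching a Decision Requester component) typically looks like the sketch below. This is a hedged illustration assuming the standard Unity ML-Agents `Agent` API; `TRCAgent` and `ShouldAct()` are hypothetical names, not the actual project code:

```csharp
using Unity.MLAgents;

// Minimal sketch: an agent that asks the policy for a decision only
// when a game-specific condition is met, rather than on a fixed
// schedule driven by a DecisionRequester component.
public class TRCAgent : Agent
{
    void FixedUpdate()
    {
        if (ShouldAct())        // hypothetical game-specific condition
        {
            RequestDecision();  // queue a policy decision for this step
        }
    }

    bool ShouldAct()
    {
        // Placeholder condition for illustration only.
        return true;
    }
}
```

One thing to watch with manual requests: if `RequestDecision()` is never called, the agent simply keeps repeating (or receives no) actions, which can also look like "the input is always 0".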
While training for about 9K steps, it did once or twice send 1 and -1 as inputs. But the problem persists: why is the input 0 most of the time, with 1 and -1 only rarely? Why never a larger value like 25 or -39?
I don't mind sharing scripts and scenes; if required, please let me know.
I reviewed the code and found the reason: I was casting the input to an int. After removing the cast, I get various floating-point inputs.
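This also explains the earlier observation of "0 most of the time, 1 and -1 rarely": continuous actions are sampled in [-1, 1], and a C# int cast truncates toward zero. A small standalone sketch (illustrative values only, not the project code):

```csharp
using System;

class CastDemo
{
    static void Main()
    {
        // An int cast truncates toward zero, so any action strictly
        // between -1 and 1 collapses to 0; only exactly ±1.0 survive.
        Console.WriteLine((int)0.83f);   // 0
        Console.WriteLine((int)-0.46f);  // 0
        Console.WriteLine((int)1.0f);    // 1
        Console.WriteLine((int)-1.0f);   // -1
    }
}
```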
But one final question remains: why are the input values so small? Are they meant to range between -1 and 1? I did check other issues and discussions, specifically https://github.com/Unity-Technologies/ml-agents/issues/319, where the author gets values like 520, 630, and -580.
Anyway, since this is not the main topic of the issue, I am closing it.
If you can help with getting large inputs directly, rather than mathematically clamping them, that would be great!
Thank you so much!
It is intentional that our policies output actions in the range [-1, 1]. The reason is to keep the scale of the network weights small, which makes training more robust and stable. To get larger values, I recommend mapping the action from [-1, 1] to your desired range in C# through multiplication/addition.
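That mapping is a one-line linear rescale. A minimal sketch (the target range [-580, 630] is borrowed from the values quoted from issue #319, purely for illustration; `MapAction` is a hypothetical helper name):

```csharp
using System;

class ActionMapping
{
    // Linearly map a policy action a in [-1, 1] to [min, max]:
    // a = -1 -> min, a = 0 -> midpoint, a = 1 -> max.
    static float MapAction(float a, float min, float max) =>
        min + (a + 1f) * 0.5f * (max - min);

    static void Main()
    {
        Console.WriteLine(MapAction(-1f, -580f, 630f)); // -580
        Console.WriteLine(MapAction( 1f, -580f, 630f)); // 630
        Console.WriteLine(MapAction( 0f, -580f, 630f)); // 25
    }
}
```

In an agent, you would apply this inside `OnActionReceived` before using the value, so the policy itself still learns in the well-conditioned [-1, 1] space.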
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
**Describe the bug**
I am using mlagents==0.27.0. The continuous action input is always 0 while training.

**To Reproduce**
Use the configuration below and try training any model.

**Console logs / stack traces**
Not applicable.

**Screenshots**
Not applicable.

**Environment (please complete the following information):**