edomedo00 closed this 3 months ago
Adding some useful resources for this Issue:
I'm currently focused on GAIL and Behavioural Cloning with extrinsic rewards, and I'm looking forward to using intrinsic rewards for the final agent functionality.
P.S. If you want to use a pre-trained model, you need to include the init_path value in the YAML configuration file. Note that you need to use the same configuration settings for the new model you want to train.
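For reference, a minimal sketch of where init_path sits in the behavior config (the path below is a placeholder, not an actual checkpoint; it would normally point at results/<previous-run-id>/<behavior-name> from the earlier run):

behaviors:
  Bolas:
    trainer_type: ppo
    # Placeholder: checkpoint folder of the run to initialize from
    init_path: results/previous_run/Bolas

Alternatively, the --initialize-from=<run-id> flag on mlagents-learn should achieve the same thing without editing the file.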
Adding the configuration file I'm using:
behaviors:
  Bolas:
    trainer_type: ppo
    hyperparameters:
      batch_size: 512
      buffer_size: 4096
      learning_rate: 0.0003
      beta: 0.05
      epsilon: 0.1
      lambd: 0.95
      num_epoch: 5
      learning_rate_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 512
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      gail:
        gamma: 0.99
        strength: 0.01
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
        learning_rate: 0.0003
        use_actions: false
        use_vail: false
        demo_path: C:\Users\juana\Documents\Verano2024\spooringTrainingMLAgents\Assets\Demonstrations\bolasDemoRandomW.demo
    keep_checkpoints: 5
    max_steps: 1000000
    time_horizon: 256
    summary_freq: 50000
    behavioral_cloning:
      demo_path: C:\Users\juana\Documents\Verano2024\spooringTrainingMLAgents\Assets\Demonstrations\bolasDemoRandomW.demo
      steps: 50000
      strength: 1.0
      samples_per_update: 0
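Assuming the file above is saved as config.yaml (the filename and run ID below are arbitrary), training is launched with:

# Run ID names the output folder; checkpoints land under results/bolas_run/
mlagents-learn config.yaml --run-id=bolas_run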
Due to the complexity of the final training, I moved this task back to the To Do section and will start with simpler tasks.