Open gijskoning opened 3 years ago
Table 4. PPO hyperparameters.
Parameter | Value |
---|---|
learning rate | 2.5e-4 |
discount γ | 0.99 |
GAE λ | 0.95 |
memory size | 128 |
batch size | 32 |
num. epochs | 3 |
num. workers | 8 |
entropy β | 1.0e-2 |
clip ε | 0.1 |
value coeff. c1 | 1 |
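
To make the table easier to reuse, here is a minimal sketch that collects the values in a Python dataclass and shows the clipped surrogate term that the clip parameter feeds into. The class name `PPOConfig`, the field names, and the helper `clipped_surrogate` are hypothetical and not taken from any existing codebase. The derived counts assume that "memory size" means the number of transitions each worker collects per rollout, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    """PPO hyperparameters from Table 4 (field names are illustrative)."""
    learning_rate: float = 2.5e-4
    gamma: float = 0.99          # discount factor
    gae_lambda: float = 0.95     # GAE parameter (0.95 in the table)
    memory_size: int = 128       # assumed: transitions per worker per rollout
    batch_size: int = 32
    num_epochs: int = 3
    num_workers: int = 8
    entropy_coeff: float = 1.0e-2
    clip_epsilon: float = 0.1
    value_coeff: float = 1.0     # c1

    @property
    def samples_per_rollout(self) -> int:
        # Assumption: every worker contributes memory_size transitions.
        return self.memory_size * self.num_workers

    @property
    def minibatches_per_epoch(self) -> int:
        return self.samples_per_rollout // self.batch_size

    @property
    def updates_per_rollout(self) -> int:
        return self.minibatches_per_epoch * self.num_epochs


def clipped_surrogate(ratio: float, advantage: float, epsilon: float) -> float:
    """Standard PPO clipped objective for a single sample.

    ratio is pi_new(a|s) / pi_old(a|s); the result is to be maximised.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    cfg = PPOConfig()
    print(cfg.samples_per_rollout, cfg.minibatches_per_epoch, cfg.updates_per_rollout)
    print(clipped_surrogate(1.5, 1.0, cfg.clip_epsilon))
```

Under the stated assumption, each rollout yields 128 × 8 = 1024 transitions, split into 32 minibatches of 32, for 96 gradient updates over 3 epochs. If "memory size" instead refers to the whole buffer, the derived counts change accordingly.
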
It might be good to start with only the FNN. I also found that it is better to begin with the Traffic Control environment, since that model is a lot smaller.