Open asaficontact opened 2 months ago
import torch

policy_kwargs = dict(
    activation_fn=torch.nn.ReLU,
    net_arch=dict(pi=[512, 128, 64, 32, 16], vf=[512, 128, 64, 32, 16]),
)
In model.config, add a new policy_kwargs argument and pass it this dict.
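A minimal sketch of how a policy_kwargs option could be threaded through a config object. The layout of model.config is not shown in this issue, so `ModelConfig`, `learning_rate`, and `build_model_kwargs` are hypothetical names used only for illustration, and the activation is a string placeholder so the snippet runs without torch installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ModelConfig:
    """Hypothetical stand-in for model.config; only policy_kwargs comes from the issue."""

    learning_rate: float = 3e-4  # illustrative existing option (assumption)
    # Extra keyword arguments meant to be forwarded to the policy network.
    policy_kwargs: Dict[str, Any] = field(default_factory=dict)


def build_model_kwargs(config: ModelConfig) -> Dict[str, Any]:
    """Collect the keyword arguments that would be handed to the model constructor."""
    kwargs: Dict[str, Any] = {"learning_rate": config.learning_rate}
    if config.policy_kwargs:
        # Forward policy_kwargs only when the user actually set something,
        # so the library defaults still apply otherwise.
        kwargs["policy_kwargs"] = dict(config.policy_kwargs)
    return kwargs


layers = [512, 128, 64, 32, 16]
config = ModelConfig(
    policy_kwargs=dict(
        activation_fn="ReLU",  # placeholder; the issue passes torch.nn.ReLU here
        net_arch=dict(pi=list(layers), vf=list(layers)),
    )
)
model_kwargs = build_model_kwargs(config)
```

Defaulting policy_kwargs to an empty dict keeps existing configs working unchanged, since nothing is forwarded unless the option is set.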
Check whether the entropy coefficient starts high and decreases over the course of training, i.e. whether exploration is reduced as the model is trained.
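The check above could be sketched as follows. This is only an illustration: `linear_ent_coef` is a hypothetical decay schedule, the start and end values (0.05 and 0.001) are made up, and nothing in this issue confirms that the training code currently lets the coefficient vary. `is_non_increasing` is the actual check and can be applied to any logged sequence of coefficient values.

```python
from typing import Sequence


def linear_ent_coef(progress_remaining: float, start: float = 0.05, end: float = 0.001) -> float:
    """Interpolate linearly from `start` (progress_remaining=1) to `end` (progress_remaining=0)."""
    if not 0.0 <= progress_remaining <= 1.0:
        raise ValueError("progress_remaining must be in [0, 1]")
    return end + (start - end) * progress_remaining


def is_non_increasing(values: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if each value is no larger than the one before it (within tol)."""
    return all(b <= a + tol for a, b in zip(values, values[1:]))


# Simulate the coefficient over a short run; in practice these values
# would come from the training logs instead.
total_steps = 10
schedule = [linear_ent_coef(1.0 - step / total_steps) for step in range(total_steps + 1)]
decays_as_expected = is_non_increasing(schedule) and schedule[0] > schedule[-1]
```

If the logged values are constant, `is_non_increasing` still passes, so the extra `schedule[0] > schedule[-1]` comparison is what confirms the coefficient actually drops.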
We need to add the following additional hyperparameter options: