I have been training the agent using your notebook. I have also been tuning the hyperparameters using Optuna.
However, it doesn't converge as it does in your YouTube video.
Could you share your hyperparameters or TensorBoard log?
It would be great if you could also share which reward function you are using to get the agent to converge.