Closed gowthamnatarajan closed 1 year ago
Hello, please fill in the issue template completely (a minimal working example and a markdown code block are missing). This is the second time I'm asking you (https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/issues/142#issuecomment-1412002247). If you continue not to follow the basic rules for asking for help, I'll have to close any new issue without reading it.
PS: I need to add the custom gym env issue template to sb3 contrib, but you can find/copy it from sb3.
Here is the code:
The training begins, and initially it picks actions randomly, as expected. But after a few seconds I get the following warning:
After this, it always picks just one action over and over again and nothing happens. It no longer picks actions at random, so it does not learn anything. What could be causing this? The same error occurs even with the regular PPO algorithm in the sb3 package as well.
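One way to narrow a symptom like this down is to log the actions sampled during training and the numbers the custom env returns, then check two things: whether the action distribution has really collapsed, and whether any observation or reward is NaN or infinite. The sketch below uses only the Python standard library. The helper names `action_entropy` and `find_non_finite` are hypothetical (they are not part of sb3 or sb3 contrib), and the sample values are made up for illustration. It does not identify the cause by itself; it only helps confirm what is happening.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence


def action_entropy(actions: Sequence[int]) -> float:
    """Empirical Shannon entropy (in nats) of a sequence of discrete actions.

    A value near log(n_actions) means actions are spread out;
    a value of 0 means a single action is chosen every time.
    """
    if not actions:
        raise ValueError("actions must not be empty")
    counts = Counter(actions)
    total = len(actions)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def find_non_finite(values: Iterable[float]) -> List[int]:
    """Return the indices of NaN or infinite entries in a flat list of numbers."""
    return [i for i, v in enumerate(values) if not math.isfinite(v)]


if __name__ == "__main__":
    # Hypothetical logged actions: early in training vs. after the warning.
    early = [0, 1, 2, 3, 0, 1, 2, 3]
    late = [2, 2, 2, 2, 2, 2, 2, 2]
    print("early entropy:", action_entropy(early))
    print("late entropy:", action_entropy(late))

    # Hypothetical flattened observation containing bad values.
    print("non-finite indices:", find_non_finite([0.1, float("nan"), 3.0, float("inf")]))
```

If the entropy drops to zero right after the warning, or the env ever returns a non-finite value, including that information together with the full warning text and a minimal working example in the issue makes the problem much easier to reproduce.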