Up until this morning I had no issues running PPO on my custom multi-agent env. However, after a certain number of episodes, training gets 'stuck': it still appears to be training and the iteration count keeps increasing, but no new episodes are carried out. I have tried a variety of stopping criteria, including the original one, which used to work just fine.
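One way to pin down when the stall begins is to look at the per-iteration result dicts that RLlib reports. The sketch below is only an illustration: the helper name `stalled_iterations` and the sample values are hypothetical, and it assumes each result carries the usual `training_iteration` and `episodes_this_iter` keys.

```python
from typing import Dict, List


def stalled_iterations(results: List[Dict]) -> List[int]:
    """Return the training_iteration values at which no episode completed."""
    return [
        r["training_iteration"]
        for r in results
        if r["episodes_this_iter"] == 0
    ]


# Hypothetical result history, shaped like the dicts returned by trainer.train().
history = [
    {"training_iteration": 1, "episodes_this_iter": 12, "episodes_total": 12},
    {"training_iteration": 2, "episodes_this_iter": 9, "episodes_total": 21},
    {"training_iteration": 3, "episodes_this_iter": 0, "episodes_total": 21},
    {"training_iteration": 4, "episodes_this_iter": 0, "episodes_total": 21},
]

print(stalled_iterations(history))  # -> [3, 4]
```

If `episodes_total` stops growing while `training_iteration` keeps increasing, the env is probably never reporting an episode end for some agents, which would be worth checking in the custom env's done logic.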
Hello, kiranikram!
I see that you are using "use_lstm" and "fcnet_hiddens" at the same time.
I guess the agent's neural network is two fully connected layers followed by an LSTM, right?
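For reference, a model config that combines the two options might look like the sketch below. The layer sizes and sequence length are assumptions for illustration, not the values from the original script, which is truncated in this thread. With "use_lstm" enabled, RLlib wraps the fully connected stack defined by "fcnet_hiddens" with an LSTM.

```python
# Hypothetical model settings; the sizes are illustrative, not taken from the post.
model_config = {
    "fcnet_hiddens": [256, 256],  # two fully connected layers (assumed sizes)
    "use_lstm": True,             # wrap the FC stack with an LSTM
    "lstm_cell_size": 256,        # LSTM hidden state size (assumed)
    "max_seq_len": 20,            # max sequence length for LSTM training (assumed)
}

# The model settings go under the "model" key of the trainer config.
config_fragment = {"model": model_config}

print(config_fragment["model"]["fcnet_hiddens"])
```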
[rllib]
Ray 1.2, Python 3.7.6
config = { "env": RLlibWrapper, "env_config": {'jungle': 'EasyExit', "size": 11}, "no_done_at_end": False, "gamma": 0.9,
# Use GPUs iff RLLIB_NUM_GPUS env var set to > 0.
needs-repro-script
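The GPU comment above usually accompanies a "num_gpus" entry that is read from the environment. A minimal sketch of that pattern follows, assuming the variable holds an integer string; the helper name `num_gpus_from_env` is hypothetical.

```python
import os
from typing import Mapping


def num_gpus_from_env(env: Mapping[str, str] = os.environ) -> int:
    """Read RLLIB_NUM_GPUS from the environment, defaulting to 0 (CPU only)."""
    return int(env.get("RLLIB_NUM_GPUS", "0"))


# Fragment to merge into the trainer config.
config_fragment = {"num_gpus": num_gpus_from_env()}

print(config_fragment)
```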
I have verified my script runs in a clean environment and reproduces the issue. I have verified the issue also occurs with the latest wheels.