davidhershey / feudal_networks

An implementation of FeUdal Networks for Hierarchical Reinforcement Learning, as published at https://arxiv.org/abs/1703.01161
MIT License

Feudal policy on PongDeterministic-v4 #8

Open ThanosRoidis opened 6 years ago

ThanosRoidis commented 6 years ago

I have been trying to get the 'feudal' policy to work on the 'PongDeterministic-v4' environment, but I have had no luck. The 'lstm' policy seems to work for me, but if I change it to 'feudal', the episode rewards do not increase even after 8 hours of training with 1 worker; they stay stuck at -20 on both the 'master' branch and the 'dilated_fix' branch.

I saw the other issues mentioning that it doesn't achieve the benchmarks from the paper, but is it supposed to work on Pong at least? Or am I doing something wrong?