StoneT2000 opened 6 days ago
I ran a second seed which is now doing slightly better, although in terms of sample efficiency it still seems to be behind the original paper: at 120M steps it is around 600 reward, whereas the paper shows about 700.
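For reference, this is roughly how I'm comparing my seeds against the paper's curve at a fixed step count; the curve values below are just placeholders standing in for my own logged (env_steps, eval_return) pairs:

```python
import numpy as np

def return_at(step, curve_steps, curve_returns):
    """Linearly interpolate an eval-return curve at a given env-step count."""
    return float(np.interp(step, curve_steps, curve_returns))

# placeholder curves standing in for the two seeds' logged (env_steps, eval_return) pairs
seed_curves = [
    (np.array([0.0, 60e6, 120e6]), np.array([0.0, 450.0, 600.0])),
    (np.array([0.0, 60e6, 120e6]), np.array([0.0, 430.0, 610.0])),
]

vals = [return_at(120e6, steps, rets) for steps, rets in seed_curves]
print(f"mean eval return at 120M steps: {np.mean(vals):.0f} (paper reports ~700)")
```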
I also tried running PPO just now, but it seems to fail completely using:
python scripts/train_baselines.py algo=ppo_algo task=FrankaCubeStack isaac_param=True
Is there anything else I might need to add to test the baselines?
Hi, thank you for open-sourcing PQL! I am trying to reproduce the results on a single GPU (a 4090) and am finding that the algorithm's eval return doesn't go past the ~400 mark.
I ran the following command to test PQL (not PQL-D):
I am using `ca7a4fb762f9581e39cc2aab644f18a83d6ab0ba` as the IsaacGymEnvs git commit, together with the Isaac Gym Preview 4 release and the latest git commit of the pql repo. I don't think this is a reward-scale issue, since I modified the code to print the success rate for stacking the cube, and it is mostly close to 0. Any idea?
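For context, the success-rate check I mean is roughly the sketch below. The per-env success flag and where it gets read out of the env are my own additions (as far as I know the stock IsaacGymEnvs FrankaCubeStack task doesn't expose one under that name), so treat the names as placeholders:

```python
import torch

def update_success_rate(success_count, episode_count, success_buf, reset_buf):
    """Accumulate per-env stacking successes at episode resets and return the running rate.

    success_buf: 1 where cube A ended up stacked on cube B this episode (my added flag).
    reset_buf:   1 where the env is resetting on this step.
    """
    done = reset_buf.bool()
    success_count += (success_buf.bool() & done).sum().item()
    episode_count += done.sum().item()
    return success_count, episode_count, success_count / max(episode_count, 1)

# toy usage with fake buffers for 8 parallel envs
succ, eps = 0, 0
success_buf = torch.tensor([1, 0, 0, 1, 0, 0, 0, 0])
reset_buf = torch.tensor([1, 1, 0, 1, 0, 0, 0, 0])
succ, eps, rate = update_success_rate(succ, eps, success_buf, reset_buf)
print(f"stacking success rate so far: {rate:.2f}")  # 2 successes over 3 finished episodes
```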