normandipalo / curiosity-robot

Curiosity based exploration and playing in RL with Gym Robotics envs.

Unable to pick the Block #1

Closed Ameyapores closed 5 years ago

Ameyapores commented 5 years ago

Hi @normandipalo, amazing implementation of PPO. I ran the code for 10000 episodes. By the end, the robot acquires a behaviour of moving the block around randomly, which is intuitive since it is trained only on curiosity rewards. However, even when I include an extrinsic reward (i.e. the distance between the block and the target position), it does not learn to pick the block up; a sketch of how I combine the rewards is below. Could you speculate on a reason for this? Also, the loss of the actor network goes down while the critic loss stays constant. Finally, I am wondering whether there is a reason to normalize the states?
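For reference, this is roughly how I am mixing the two signals. It is only a sketch, not the repo's exact code: the `FetchPickAndPlace-v1` env id, the `BETA_*` weights and the `compute_intrinsic_reward` stub are my own placeholders.

```python
import numpy as np
import gym

# Hypothetical weights for mixing the dense distance reward with curiosity.
BETA_EXTRINSIC = 1.0
BETA_INTRINSIC = 0.1

env = gym.make("FetchPickAndPlace-v1")
obs = env.reset()

def extrinsic_reward(obs):
    # Dense shaping: negative Euclidean distance between the block position
    # (achieved_goal) and the target position (desired_goal).
    return -np.linalg.norm(obs["achieved_goal"] - obs["desired_goal"])

def compute_intrinsic_reward(state, action, next_state):
    # Placeholder for the curiosity signal (e.g. forward-model prediction
    # error); shown as a stub here.
    return 0.0

action = env.action_space.sample()
next_obs, _, done, info = env.step(action)
reward = (BETA_EXTRINSIC * extrinsic_reward(next_obs)
          + BETA_INTRINSIC * compute_intrinsic_reward(obs, action, next_obs))
```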

normandipalo commented 5 years ago

Hello @Ameyapores, thank you. This is a tricky topic: as you can see from the literature, learning to pick up a block is a hard task for a robot, and the problem is usually tackled with hierarchical RL techniques, imitation learning from examples, hindsight experience replay, or other methods. So I'm not surprised that the robot doesn't learn to pick up the block efficiently, since this general algorithm is not optimized for that.

The loss functions of the actor and critic can have quite different shapes compared to usual supervised learning, because the data distribution keeps changing (when the robot learns a new behaviour it reaches different states and takes different actions). If the reward goes up you should not worry too much; the hyperparameters should already be reasonably tuned for the task.

Normalizing inputs is almost always a good idea in deep learning. I tend to do it by default and it generally pays off. Normalizing the rewards is often a good idea too.
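To illustrate the kind of normalization I mean, here is a rough sketch of an online running mean/std normalizer for states and rewards. This is not the code in this repo; the `RunningNorm` name and the state dimension are just assumptions for the example.

```python
import numpy as np

class RunningNorm:
    """Online running mean/variance (parallel Welford update), commonly used
    to normalize observations or rewards during RL training."""
    def __init__(self, shape, eps=1e-8):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = eps
        self.eps = eps

    def update(self, x):
        # x: a batch of samples, shape (batch, *shape).
        x = np.asarray(x, dtype=np.float64)
        batch_mean, batch_var, batch_count = x.mean(0), x.var(0), x.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        self.mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta**2 * self.count * batch_count / total) / total
        self.count = total

    def normalize(self, x):
        return (np.asarray(x) - self.mean) / np.sqrt(self.var + self.eps)

# Usage: one normalizer for states, one (scalar) for rewards.
state_norm = RunningNorm(shape=(25,))   # hypothetical state dimension
reward_norm = RunningNorm(shape=())     # scalar rewards
```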