andyzeng / visual-pushing-grasping

Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.
http://vpg.cs.princeton.edu/
BSD 2-Clause "Simplified" License

cannot reproduce the synergy between pushing and grasping #28

Open st2yang opened 5 years ago

st2yang commented 5 years ago

Hi,

Did anyone manage to reproduce the synergy between pushing and grasping?

I tried to train the model from scratch in simulation for over 4k iterations, but the robot failed to learn the synergy. On the provided test case test-10-obj-07.txt, the robot keeps grasping and failing, so the performance is far worse than with the provided weight file.

I used the master branch code and didn't change anything, and I followed the tutorial strictly. I am using PyTorch 0.3.1 and Ubuntu 16.04.

Please let me know if anyone managed to reproduce the results.

Thanks, st2yang

st2yang commented 5 years ago

Updates for those interested in this issue. I tried training the model from scratch many times; these are my observations:

(1) The synergy is quite hard to reproduce.
(2) I never see the synergy phenomenon when training is under 5k iterations.
(3) I do sometimes see the synergy when I train the model for up to 9k iterations.

Based on my observations, the synergy seems much harder to train than the paper suggests. So far I don't know what the key to reproducing it is. I would appreciate a response and some guidance from the author @andyzeng. Thank you in advance.

Thanks, st2yang

hemingchen commented 4 years ago

I had the same observations after training the model from scratch several times, each for 20k+ iterations. I feel the push network was not trained sufficiently, since most of the trials I observed were grasps. Grasping itself can be trained fairly well after just a few hundred iterations, and with 5k+ iterations there should be some level of synergy. It's strange.
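For anyone who wants to quantify this, here is a minimal sketch that counts the executed primitives in a session log. It assumes the logger writes `transitions/executed-action.log.txt` with the primitive id in the first column (0 = push, 1 = grasp); the path and column layout are my reading of the master-branch logger, so adjust them if your logs differ.

```python
import numpy as np

# Count executed pushes vs. grasps in a training session.
# Assumption: one row per executed action, first column is the primitive
# id (0 = push, 1 = grasp). 'YOUR-SESSION' is a placeholder directory.
log = np.loadtxt('logs/YOUR-SESSION/transitions/executed-action.log.txt')
actions = log[:, 0].astype(int)
num_push = int(np.sum(actions == 0))
num_grasp = int(np.sum(actions == 1))
total = len(actions)
print('pushes: %d (%.1f%%), grasps: %d (%.1f%%)' % (
    num_push, 100.0 * num_push / total,
    num_grasp, 100.0 * num_grasp / total))
```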

@st2yang just to confirm, were you also using the command suggested in the documentation? `python main.py --is_sim --push_rewards --experience_replay --explore_rate_decay --save_visualizations`

st2yang commented 4 years ago

@hemingchen Yes, I used the suggested standard flags.

About the synergy, here are my guesses:

(1) The push net and grasp net themselves are trained well, i.e., they can push and grasp objects.
(2) But the synergy is hard to obtain with a greedy deterministic policy, especially if the MDP is not well formulated.
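To illustrate guess (2) with a minimal sketch (not the repository's actual code): greedy selection executes whichever primitive has the higher max Q value, so if the grasp net's predictions are even slightly inflated relative to the push net's, pushes are almost never executed and the push net collects little on-policy experience. `push_q` and `grasp_q` below are hypothetical dense affordance maps.

```python
import numpy as np

def greedy_action(push_q, grasp_q):
    # Pick the primitive whose best pixel has the highest predicted Q value.
    if np.max(grasp_q) >= np.max(push_q):
        return 'grasp', np.unravel_index(np.argmax(grasp_q), grasp_q.shape)
    return 'push', np.unravel_index(np.argmax(push_q), push_q.shape)

# If grasp Q values are systematically (even slightly) higher, pushing is
# essentially never selected greedily, so the push net barely improves.
rng = np.random.default_rng(0)
push_q = rng.uniform(0.0, 0.5, size=(16, 224, 224))   # hypothetical push map
grasp_q = rng.uniform(0.1, 0.6, size=(16, 224, 224))  # hypothetical grasp map
print(greedy_action(push_q, grasp_q)[0])  # almost surely 'grasp'
```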

HaoZhang990127 commented 3 years ago

Yes, I also have this question. When I train for 5k iterations, the agent grasps quite well in the training environment, but on the test cases it keeps failing at grasps instead of pushing. If you got the push-grasp synergy to work, please tell me how you did it. Thank you.

gsbandy24 commented 2 years ago

I had the same experience. My master's thesis focuses on applying this method to a case study in the agro-food industry, and pushing is hardly utilized in testing. One thing they do not show in the paper (but which is an output of evaluate.py) is the grasp-to-push ratio percentage. I suspect this omission was deliberate, to embellish the usefulness of pushing and the synergy between the two motion primitive policies.
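For reference, the grasp-to-push ratio shows up when running the evaluation script. If I remember the README correctly, the invocation looks something like the following (the session directory name is a placeholder, and `N` is the number of objects that counts as task completion):

```shell
python evaluate.py --session_directory 'logs/YOUR-SESSION-DIRECTORY-NAME-HERE' --method 'reinforcement' --num_obj_complete N
```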

Also, they do not test their "no reward" pushing models on the test cases. I have run multiple models (4 each) for push-no-reward and the default version, and each time the training graphs look quite similar, which contrasts with their results; this made me want to see how the "no reward" model performed on the test cases. The overall grasping success rate went down, because pushes were replaced by failed grasps that create space for the gripper fingers, but the completion (clearance) rate went up: less pushing meant fewer failure modes from pushing objects out of the field of view. The constant, useless pushing when only a few objects are left in the workspace also stopped, which further reduced failures during testing.
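For anyone who wants to repeat this comparison, the preset test cases can be run against a trained snapshot with something like the command below. The flags are from the README as I remember them, the snapshot path is a placeholder, and I may be forgetting a flag or two, so check the README before running:

```shell
python main.py --is_sim --obj_mesh_dir 'objects/blocks' --num_obj 10 \
    --is_testing --test_preset_cases --test_preset_file 'simulation/test-cases/test-10-obj-07.txt' \
    --load_snapshot --snapshot_file 'logs/YOUR-SESSION/models/snapshot-backup.reinforcement.pth' \
    --save_visualizations
```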

I think that without affordance-based grasping (with center-of-mass grasping instead), a stronger synergy would develop between pushing and grasping, but it wasn't something I had time to test. I think what they have done is very interesting, but it is difficult to reproduce their results, and the method is meant for exploring robot learning with multiple actions. Maybe it was not the best method to explore for an industry-specific case, but you live and you learn :)
