andyzeng / visual-pushing-grasping

Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.
http://vpg.cs.princeton.edu/
BSD 2-Clause "Simplified" License

the initial value in Net and define the best point #47

Closed ChenyangRan closed 4 years ago

ChenyangRan commented 4 years ago

Hi, I have read your code and articles, but one problem has been bothering me. Why are the predictions from the randomly initialized weights already so good? And why do you think the predicted point will lead to a successful action?

I read another of your articles: Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching. In ARC, the best point is predicted from hand-marked labels. I assumed that VPG took its definition of the best point from that article, but VPG does not seem to use the ARC results, or at least I did not see where it does.

The FC is trained using a custom reward as the label, yet the randomly initialized weights already seem to give excellent predictions, even though the initial values appear to be random. Can anyone answer this question?
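
To make the question concrete, here is a minimal sketch of what "best point" could mean, assuming the network outputs one dense score map per rotation and the best point is simply the highest-scoring pixel. The function `best_point` and the random maps are hypothetical stand-ins written for illustration, not code from the repository. The sketch shows that an argmax is always defined, even for random scores, so having a "best point" does not by itself mean the prediction is meaningful.

```python
import random


def best_point(score_maps):
    """Return ((rotation_idx, row, col), score) for the highest score.

    score_maps is a nested list indexed as [rotation][row][col].
    """
    best = None
    best_score = float("-inf")
    for k, score_map in enumerate(score_maps):
        for r, row in enumerate(score_map):
            for c, score in enumerate(row):
                if score > best_score:
                    best_score = score
                    best = (k, r, c)
    return best, best_score


# Hypothetical stand-in for an untrained network: purely random scores.
rng = random.Random(0)
num_rotations, height, width = 4, 5, 5
random_maps = [
    [[rng.random() for _ in range(width)] for _ in range(height)]
    for _ in range(num_rotations)
]

point, score = best_point(random_maps)
print("best (rotation, row, col):", point, "score:", score)
```

With random scores the selected point is arbitrary; under this assumption, it would only become meaningful once training pushes the scores toward the observed rewards.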