Closed: wilhem closed this issue 5 months ago
Hi,
check https://github.com/qgallouedec/panda-gym/issues/61#issuecomment-1492878615 and https://github.com/qgallouedec/panda-gym/issues/8#issuecomment-911512499
Ping if it doesn't answer your question :)
Hi, thank you very much for your answer. I found the paper you linked very interesting, but there is a point that makes me unsure. You wrote the following:
> For Reach, there is no component linked to the task because there is no object (in this case, the terminology is a bit misleading because there is still a task to perform, but this is a special case so we left it like that)
How should I understand that statement? Does the desired_goal array NOT contain the position of the object to be reached when using the Reach environment?
By the way: is the list of observable features still valid, or has it changed? The list is for -v1, and now we have -v3.
> How should I understand that statement? Does the desired_goal array NOT contain the position of the object to be reached when using the Reach environment?
This is a design-related remark.
I wanted to dissociate the task from the robot. Thus, the criterion for achieving the task should not depend on the robot's state. For example, if the task is to push an object to a target position, the only thing that matters is whether the object's position matches the desired position. For Reach, this is special because the task involves the robot's state, but it's meant to be a special case.
For observation, desired_goal is the gripper's target position (to be precise, it's a position, not an object to be reached), and achieved_goal is the gripper's current position.
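If it helps, here is a minimal sketch (assuming panda-gym v3 with the gymnasium API) of what those two entries look like for PandaReach-v3:

```python
import gymnasium as gym
import panda_gym  # importing panda_gym registers the Panda environments

env = gym.make("PandaReach-v3")
obs, info = env.reset()

# desired_goal: the target position the end-effector should reach
print("desired_goal :", obs["desired_goal"])
# achieved_goal: the current end-effector position
print("achieved_goal:", obs["achieved_goal"])

env.close()
```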
> By the way: is the list of observable features still valid, or has it changed? The list is for -v1, and now we have -v3.
Yes, still valid
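If you want to double-check the layout for -v3 yourself, a quick sketch (same assumptions as above: panda-gym v3 with gymnasium) is to print the observation space:

```python
import gymnasium as gym
import panda_gym  # registers the Panda environments

env = gym.make("PandaReach-v3")

# The observation space is a Dict with the keys "observation",
# "achieved_goal" and "desired_goal"; the shapes show how many
# features each entry carries for this task.
for key, space in env.observation_space.spaces.items():
    print(key, space.shape)

env.close()
```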
> How should I understand that statement? Does the desired_goal array NOT contain the position of the object to be reached when using the Reach environment?
>
> For Reach, this is special because the task involves the robot's state, but it's meant to be a special case. For observation, desired_goal is the gripper's target position (to be precise, it's a position, not an object to be reached), and achieved_goal is the gripper's current position.
Sorry again, but this last statement confused me completely. I'm using the Reach task. In that case the gripper should reach a green object on the working table. A reward of 0.0 is granted when the difference between gripper and object is less than 5 cm. Among 'observation', 'achieved_goal' and 'desired_goal', which one:

- is the position of the green object?
- is the position of the gripper?
- are the first 3 values of 'observation' the actual position of the gripper?
> Sorry again, but this last statement confused me completely.

Don't worry, if you're asking the question, it's not clear enough.

> I'm using the Reach task. In that case the gripper should reach a green object on the working table.

Yes, two details though,

> A reward of 0.0 is granted when the difference between gripper and object is less than 5 cm.

True
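As a rough sketch of that rule (0.05 m is the library's default distance threshold; the real computation happens inside the task's compute_reward, so treat this only as an illustration):

```python
import numpy as np

# Sketch of the sparse reward rule: 0.0 when the end-effector is
# within the distance threshold of the target, -1.0 otherwise
# (0.05 m is the default threshold).
def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
    d = np.linalg.norm(achieved_goal - desired_goal, axis=-1)
    return np.where(d > threshold, -1.0, 0.0)

print(sparse_reward(np.array([0.0, 0.0, 0.00]), np.array([0.0, 0.0, 0.04])))  # 0.0
print(sparse_reward(np.array([0.0, 0.0, 0.00]), np.array([0.0, 0.0, 0.10])))  # -1.0
```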
> which one:
>
> - is the position of the green object?

desired_goal

> - is the position of the gripper?

achieved_goal

> - are the first 3 values of 'observation' the actual position of the gripper?

Yes
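Putting the three answers together, here is a quick check you can run yourself (a sketch under the same assumptions, PandaReach-v3 with gymnasium); for Reach, achieved_goal and the first three values of "observation" should describe the same point:

```python
import gymnasium as gym
import numpy as np
import panda_gym  # registers the Panda environments

env = gym.make("PandaReach-v3")
obs, info = env.reset()

target = obs["desired_goal"]               # position of the green target marker
gripper = obs["achieved_goal"]             # current end-effector (gripper) position
gripper_from_obs = obs["observation"][:3]  # first 3 values of "observation"

print(np.allclose(gripper, gripper_from_obs))  # expected: True
print(np.linalg.norm(gripper - target))        # distance to the target, in metres

env.close()
```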
I'm trying to implement a simulation environment using Panda-Gym, but I couldn't find enough information. I'm still confused about the difference among the following: observation["observation"], observation["achieved_goal"] and observation["desired_goal"]. Initially I thought that observation["observation"] contains the joint values of the Panda. But then I saw this example here, where the current_position takes the first 3 elements of observation["observation"]. That means that observation["observation"][0:3] contains the x, y, z position of the gripper. Is that right? Then I have a question about observation["achieved_goal"]. What exactly is achieved_goal? Does it contain the last achieved goal and get updated once a new goal has been reached? Or is it the x, y, z position of the gripper?
Many thanks