Farama-Foundation / Gymnasium-Robotics

A collection of robotics simulation environments for reinforcement learning
https://robotics.farama.org/
MIT License

[Proposal] Environment should terminate when adroit hand pen drops the pen #112

Open jjshoots opened 1 year ago

jjshoots commented 1 year ago

Proposal

In AdroitHandPen, when the agent drops the pen, there is no way to recover, but the environment still does not terminate. The proposal, as in #111, is to enable environment termination on pen drop.

leonasting commented 10 months ago

Hi, I was following this thread and wanted to check: does the pen-drop condition that was removed in #111 need to be restored?

            # penalty for dropping the pen
            if obj_pos[2] < 0.075:
                reward -= 5
                # removed code
                terminated = True

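
For reference, the check under discussion can be sketched as a standalone function. This is a minimal illustration, not the repository's code: `apply_pen_drop_rule` and the `terminate_on_drop` flag are hypothetical names, while the 0.075 threshold and the penalty of 5 are taken from the snippet above.

```python
def apply_pen_drop_rule(obj_z, reward, drop_height=0.075, penalty=5.0,
                        terminate_on_drop=True):
    """Return (reward, terminated) after applying the pen-drop check.

    obj_z is the pen's z-coordinate (obj_pos[2] in the snippet above).
    terminate_on_drop is a hypothetical switch: True mimics the removed
    `terminated = True` line, False mimics the penalty-only behavior.
    """
    dropped = bool(obj_z < drop_height)
    if dropped:
        reward -= penalty
    terminated = dropped and terminate_on_drop
    return reward, terminated


# Pen still in the hand: reward unchanged, episode continues.
print(apply_pen_drop_rule(0.25, 1.0))
# Pen below the threshold: penalty applied, episode ends.
print(apply_pen_drop_rule(0.05, 1.0))
# Same drop, but penalty-only behavior (no termination).
print(apply_pen_drop_rule(0.05, 1.0, terminate_on_drop=False))
```
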
Kallinteris-Andreas commented 10 months ago

Hey @leonasting, basically yes, but it also needs to be tested. Would you be interested in testing it and writing a PR, with a short report that shows the terminal frames?

@jjshoots what testing had you done?

Thanks!

jjshoots commented 10 months ago

This was a while ago and I don't quite remember, but if I recall correctly, the AdroitHand environments are no-termination environments: negative rewards are incurred in perpetuity (or until truncation). So adding a termination signal to HandPen specifically doesn't make sense. At least, that's as much of the discussion as I can remember.
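
To make the distinction concrete, here is a toy calculation that is not tied to the real environment. It sums per-step rewards for an episode in which the pen is dropped at a given step, once with the penalty accruing until truncation and once with termination on drop. The function name, the per-step reward of 0 before the drop, and the 100-step limit are assumptions chosen only for illustration; the penalty of 5 comes from the snippet quoted earlier in the thread.

```python
def episode_return(drop_step, max_steps=100, step_reward=0.0, penalty=5.0,
                   terminate_on_drop=False):
    """Undiscounted return of a toy episode where the pen drops at `drop_step`.

    From `drop_step` onward every step incurs `penalty`. With
    terminate_on_drop=True the episode ends on the first penalized step;
    otherwise it runs until truncation at `max_steps`.
    """
    total = 0.0
    for t in range(max_steps):
        dropped = t >= drop_step
        total += step_reward - (penalty if dropped else 0.0)
        if dropped and terminate_on_drop:
            break
    return total


no_term = episode_return(drop_step=10)
with_term = episode_return(drop_step=10, terminate_on_drop=True)
print(no_term, with_term)
```

Under these assumptions, an early drop costs far more in the no-termination variant, because the penalty repeats on every remaining step until truncation.
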

leonasting commented 10 months ago

I'm interested in testing. Let me know what tests you want me to perform. Based on the code and the environment, I infer that any agent action after the pen has left the hand is redundant. In the meantime, I will capture a few screenshots of terminal frames using the earlier code.

leonasting commented 10 months ago

Initially, the pen has a z-coordinate of 0.25 on the hand and the forehand has a value of 0.2. During experiments with random movements, the pen's z-coordinate stays between 0.2 and 0.25 while it is grasped. If dropped, it falls below 0.2 until it hits the table at around 0.8.

[Screenshots: adroit_pen, adroit_pen_2]

Kallinteris-Andreas commented 10 months ago

@leonasting it is not very clear from this camera angle; try camera_id=3 (an argument to the make constructor).

leonasting commented 10 months ago

I have attached a screenshot of the terminal state. [Screenshot: adroit_pen_3] Here is another screenshot, with the pen out of the hand. [Screenshot: adroit_pen_4]

Kallinteris-Andreas commented 10 months ago

@leonasting excellent, I think it is clear that:

  1. below 0.8 the pen has fallen, and the hand cannot interact with it in any way

Now can you show that:

  1. there is no benefit to continuing to train after the pen has fallen

A simple ablation study should do it.