sequential-dexterity / SeqDex

"Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation" code repository
https://sequential-dexterity.github.io/
Apache License 2.0

Question regarding how state reset is done in Isaac in the code #2

Closed StoneT2000 closed 9 months ago

StoneT2000 commented 10 months ago

I was wondering how the code performs state resets (that is, getting and setting the env state). I looked through the code, but the task definitions are quite complex and dense :sweat_smile:. Does Isaac have an API for this, or is it custom-written for your own tasks?

Thanks (and very neat work!)

cypypccpy commented 10 months ago

Hi @StoneT2000 ,

Thank you very much. In our code, we reset the env by getting and setting the env state. Specifically, a reset buffer, self.reset_buf, marks which environments need a reset. In the self.pre_physics_step function, every environment whose entry in the reset buffer is 1 is passed to the self.reset function, which is defined in the task, as shown in the code below:

    def pre_physics_step(self, actions):
        env_ids = self.reset_buf.nonzero(as_tuple=False).squeeze(-1)
        goal_env_ids = self.reset_goal_buf.nonzero(as_tuple=False).squeeze(-1)

        # if only goals need reset, then call set API
        if len(goal_env_ids) > 0 and len(env_ids) == 0:
            self.reset_target_pose(goal_env_ids, apply_reset=True)
        # if goals need reset in addition to other envs, call set API in reset()
        elif len(goal_env_ids) > 0:
            self.reset_target_pose(goal_env_ids)

        if len(env_ids) > 0:
            self.reset(env_ids, goal_env_ids)

As far as I know, most IsaacGym envs handle resets this way; there is no dedicated API for it.
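The reset-buffer pattern described above can be illustrated with a toy example that runs without Isaac Gym. This is only a minimal sketch: the class ToyVecEnv, its fields other than reset_buf, and the 1-D "state" are hypothetical stand-ins, not the SeqDex task code or an Isaac Gym API.

```python
class ToyVecEnv:
    """Toy vectorized env (hypothetical); each env's state is a single float."""

    def __init__(self, num_envs, max_episode_length):
        self.num_envs = num_envs
        self.max_episode_length = max_episode_length
        # 1 means "reset this env before the next physics step".
        self.reset_buf = [1] * num_envs
        self.progress_buf = [0] * num_envs
        self.state = [0.0] * num_envs
        self.reset_count = [0] * num_envs
        self.actions = [0.0] * num_envs

    def reset_idx(self, env_ids):
        # Write the initial state back for the flagged envs only.
        for i in env_ids:
            self.state[i] = 0.0
            self.progress_buf[i] = 0
            self.reset_buf[i] = 0
            self.reset_count[i] += 1

    def pre_physics_step(self, actions):
        # Same idea as reset_buf.nonzero(): collect indices that need a reset.
        env_ids = [i for i, flag in enumerate(self.reset_buf) if flag]
        if env_ids:
            self.reset_idx(env_ids)
        self.actions = list(actions)

    def post_physics_step(self):
        # Stand-in for physics: integrate the action, then flag timeouts.
        for i in range(self.num_envs):
            self.state[i] += self.actions[i]
            self.progress_buf[i] += 1
            if self.progress_buf[i] >= self.max_episode_length:
                self.reset_buf[i] = 1

    def step(self, actions):
        self.pre_physics_step(actions)
        self.post_physics_step()
        return list(self.state)


env = ToyVecEnv(num_envs=2, max_episode_length=3)
for _ in range(3):
    obs = env.step([1.0, 2.0])
# Both envs have now timed out and are flagged; the next step resets them first.
obs_after_reset = env.step([1.0, 2.0])
```

The point of the design is that a reset is not a separate call made by the training loop: the env flags itself at the end of a step and applies the reset at the start of the next one, so only the flagged indices are touched.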

Hope this can help you.

StoneT2000 commented 10 months ago

Thanks for the reply. I guess Isaac does not have a unified get/set state API (there seem to be code stubs for it, but no unified implementation), and setting up state reset mechanisms in Isaac seems non-trivial (I'm looking at the reset_target_pose function). Do I understand correctly that if apply_reset=False, you just reuse the same initial environment states (generated at some point during initialization)?

cypypccpy commented 9 months ago

Hi @StoneT2000,

Yes, our reset is non-trivial. However, the apply_reset parameter does not actually control the reset of the lego. The lego pose is randomized in the reset_idx function, so we do not reuse the same initial state. If we instead use the states from the previous task as the initial states, the randomization is not applied.
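The two cases in this reply (a randomized pose on reset versus initial states taken from the previous task) can be sketched as a single sampling function. This is an illustrative assumption about the control flow only: sample_init_pose, the nominal pose, and the noise ranges are hypothetical and are not taken from the SeqDex reset_idx implementation.

```python
import random


def sample_init_pose(rng, saved_states=None):
    """Return an initial object pose (x, y, yaw) for one environment.

    Hypothetical helper: the values and names are for illustration only.
    """
    if saved_states:
        # States handed over from the previous task are used as-is,
        # so no randomization is applied on this path.
        return rng.choice(saved_states)
    # Otherwise, randomize around a nominal pose on every reset.
    nominal = (0.0, 0.0, 0.0)
    noise = (0.1, 0.1, 3.14)
    return tuple(n + rng.uniform(-s, s) for n, s in zip(nominal, noise))


rng = random.Random(0)
random_pose = sample_init_pose(rng)
handed_over = [(1.0, 2.0, 0.5)]
chained_pose = sample_init_pose(rng, saved_states=handed_over)
```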

Hope this can help you.

StoneT2000 commented 9 months ago

I see, so this code is essentially caching N different start states, and each episode you choose one to start from?

Then, for the long-horizon task setup, how do you transition from the end state of task 1 to the start state of task 2? It seems you define multiple gym envs to do this. Would task 2's reset distribution just be task 1's goal state distribution (as achieved by rolling out a successful task 1 policy from task 1's reset distribution)?

cypypccpy commented 9 months ago

Hi @StoneT2000,

> I see, so this code is essentially caching N different start states, and each episode you choose one to start from?

Yeah.

> It seems you define multiple gym envs to do this. Would task 2's reset distribution just be task 1's goal state distribution (as achieved by rolling out a successful task 1 policy from task 1's reset distribution)?

Also correct. Different skill-chaining algorithms have different ways of collecting init states.
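The idea confirmed here (task 2 resets from a cache of states reached by successful task 1 rollouts) can be sketched on a toy 1-D problem. Everything below is a hypothetical illustration: rollout_task1, collect_init_states, reset_task2, the proportional policy, and the success tolerance are assumptions, and real skill-chaining methods may collect init states differently.

```python
import random


def rollout_task1(rng, start, goal=1.0, horizon=20, tol=0.05):
    """Run a noisy toy policy toward `goal`; return (final_state, success)."""
    x = start
    for _ in range(horizon):
        # Proportional step toward the goal plus small actuation noise.
        x += 0.5 * (goal - x) + rng.gauss(0.0, 0.01)
    return x, abs(x - goal) <= tol


def collect_init_states(rng, num_states, max_attempts=1000):
    """Cache terminal states of successful task 1 episodes."""
    buffer = []
    for _ in range(max_attempts):
        if len(buffer) >= num_states:
            break
        start = rng.uniform(-1.0, 1.0)  # task 1 reset distribution
        final, success = rollout_task1(rng, start)
        if success:
            buffer.append(final)
    return buffer


def reset_task2(rng, init_buffer):
    """Task 2 starts each episode from one of the cached states."""
    return rng.choice(init_buffer)


rng = random.Random(0)
init_buffer = collect_init_states(rng, num_states=8)
task2_start = reset_task2(rng, init_buffer)
```

Filtering on success matters: only states that the task 1 policy actually reaches when it succeeds end up in the buffer, so task 2 trains on the start states it will see when the skills are chained.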

StoneT2000 commented 9 months ago

Thanks! I understand the code details well now.