Question about the observations returned by RLTask envs

@kellyguo11 Thank you for the nice repo. I noticed that every RLTask example in OmniIsaacGymEnvs includes self.actions in the observations it returns. My guess is that this lets the agent compute a Q(s, a) value instead of a V(s) value. But in that case, why not concatenate (s, a) inside the value network in the agent implementation instead? When we extract the policy network from a trained model to use in a downstream task, having the action as part of the observation is sometimes not suitable. How can I fix this problem? Thank you for your help!
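To make the question concrete, here is a minimal NumPy sketch of the pattern I mean and of one possible workaround. It is not code from the repo: `build_obs`, `strip_actions`, and the sizes are hypothetical names and values, and it assumes the actions are appended as the last columns of the observation.

```python
import numpy as np


def build_obs(state, prev_actions):
    """Concatenate state features with the previous actions.

    This mirrors the pattern described above (observations that include
    self.actions); the function name is hypothetical.
    """
    return np.concatenate([state, prev_actions], axis=-1)


def strip_actions(obs, num_actions):
    """Drop the trailing action columns from an observation batch.

    Assumes the actions occupy the last `num_actions` columns.
    """
    return obs[..., : obs.shape[-1] - num_actions]


# Hypothetical sizes, chosen only for illustration.
num_envs, state_dim, num_actions = 4, 6, 2

state = np.random.rand(num_envs, state_dim).astype(np.float32)
prev_actions = np.zeros((num_envs, num_actions), dtype=np.float32)

obs = build_obs(state, prev_actions)          # shape (num_envs, state_dim + num_actions)
state_only = strip_actions(obs, num_actions)  # shape (num_envs, state_dim)
```

Note that slicing changes the observation size, so a policy would presumably have to be trained on the state-only observation from the start; it could not be applied to an already-trained network as is.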

isaac-sim / OmniIsaacGymEnvs

Question about the observations returned by RLTask envs #166