Open skkuai opened 2 years ago
Shouldn't the state value be subtracted when calculating the new subgoal?
As I understand it, the subgoal represents a relative position, obtained by subtracting the corresponding state components from the absolute subgoal.
However, the state is not subtracted when a new subgoal is computed, as shown below (lines 646-663 of hiro/model.py).
```python
def _choose_subgoal_with_noise(self, step, s, sg, n_s):
    if step % self.buffer_freq == 0:  # Should be zero
        sg = self.high_con.policy_with_noise(s, self.fg)
    else:
        sg = self.subgoal_transition(s, sg, n_s)
    return sg

...

def _choose_subgoal(self, step, s, sg, n_s):
    if step % self.buffer_freq == 0:
        sg = self.high_con.policy(s, self.fg)
    else:
        sg = self.subgoal_transition(s, sg, n_s)
    return sg
```
Shouldn't it be calculated as follows instead?
```python
def _choose_subgoal_with_noise(self, step, s, sg, n_s):
    if step % self.buffer_freq == 0:  # Should be zero
        sg = self.high_con.policy_with_noise(s, self.fg)
        sg -= n_s[:sg.shape[0]]
    else:
        sg = self.subgoal_transition(s, sg, n_s)
    return sg

...

def _choose_subgoal(self, step, s, sg, n_s):
    if step % self.buffer_freq == 0:
        sg = self.high_con.policy(s, self.fg)
        sg -= n_s[:sg.shape[0]]
    else:
        sg = self.subgoal_transition(s, sg, n_s)
    return sg
```
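For reference, here is a minimal NumPy sketch of the relative-subgoal convention this question relies on. It assumes the fixed goal transition from the HIRO paper, h(s, g, s') = s + g - s'. The helper names and sample values are hypothetical and are not taken from hiro/model.py. Whether `high_con.policy` already returns a relative subgoal is exactly the open point, and the sketch does not settle it.

```python
import numpy as np


def subgoal_transition(s, sg, n_s):
    """Assumed HIRO-style transition h(s, g, s') = s + g - s'.

    Only the first len(sg) state components are treated as goal dimensions.
    """
    dim = sg.shape[0]
    return s[:dim] + sg - n_s[:dim]


def to_relative(absolute_goal, s):
    """Hypothetical helper: express an absolute target relative to state s."""
    dim = absolute_goal.shape[0]
    return absolute_goal - s[:dim]


s = np.array([0.0, 0.0, 0.5])    # current state (illustrative values)
n_s = np.array([1.0, 0.5, 0.5])  # next state
sg = np.array([3.0, 2.0])        # relative subgoal at state s

new_sg = subgoal_transition(s, sg, n_s)

# The absolute target implied by the relative subgoal does not move
# between steps: s + sg == n_s + new_sg on the goal dimensions.
assert np.allclose(s[:2] + sg, n_s[:2] + new_sg)
```

Under this convention a subgoal is always an offset from the current state. An extra subtraction at the resampling step, as proposed above, would only be appropriate if the high-level policy emitted absolute positions.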