Check the `get_actions` -> `self.actor` path in `mappoPolicy.py`.
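For orientation, that call chain presumably looks like the sketch below. This is a fragment modeled on the standard on-policy MAPPO reference layout; the argument and attribute names are assumptions from that layout, not confirmed from this repo:

```python
# Sketch of the get_actions -> self.actor chain in mappoPolicy.py,
# assuming the standard on-policy MAPPO layout (names are assumptions).
class R_MAPPOPolicy:
    def get_actions(self, cent_obs, obs, rnn_states_actor, rnn_states_critic,
                    masks, available_actions=None, deterministic=False):
        # self.actor is an R_Actor; the sampled actions come back
        # from its forward() pass.
        actions, action_log_probs, rnn_states_actor = self.actor(
            obs, rnn_states_actor, masks, available_actions, deterministic
        )
        values, rnn_states_critic = self.critic(cent_obs, rnn_states_critic, masks)
        return values, actions, action_log_probs, rnn_states_actor, rnn_states_critic
```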
`class R_Actor` -> `def forward`:

```python
self.act = ACTLayer(
    action_space, self.hidden_size, self._use_orthogonal, self._gain
)
```
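For context, `R_Actor.forward` in that reference layout would route the features into `ACTLayer` roughly like this (a sketch; `self.base`, `self.rnn`, and the recurrent-policy flags are assumptions from the reference repo):

```python
# Sketch of R_Actor.forward (names assumed from the reference layout).
def forward(self, obs, rnn_states, masks, available_actions=None, deterministic=False):
    actor_features = self.base(obs)
    if self._use_naive_recurrent_policy or self._use_recurrent_policy:
        actor_features, rnn_states = self.rnn(actor_features, rnn_states, masks)
    # ACTLayer turns features into sampled actions and their log-probs.
    actions, action_log_probs = self.act(actor_features, available_actions, deterministic)
    return actions, action_log_probs, rnn_states
```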
`class ACTLayer` -> `def forward`:

```python
elif self.tuple:
    # Tuple action space: run each sub-action head separately, sample
    # (or take the mode) per branch, then concatenate the per-branch
    # actions and log-probs along the last dimension.
    actions = []
    action_log_probs = []
    for action_out in self.action_outs:
        action_logit = action_out(x)
        action = action_logit.mode() if deterministic else action_logit.sample()
        action_log_prob = action_logit.log_probs(action)
        actions.append(action)
        action_log_probs.append(action_log_prob)
    actions = torch.cat(actions, -1)
    action_log_probs = torch.cat(action_log_probs, -1)
```
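To see concretely what this branch computes, here is a minimal runnable sketch of the same multi-head sampling using plain `torch.distributions.Categorical` in place of the repo's wrapped `action_outs`; the batch size and branch sizes are made-up examples:

```python
import torch
from torch.distributions import Categorical

batch, hidden = 4, 16
x = torch.randn(batch, hidden)
branch_sizes = [3, 5]  # e.g. a Tuple/MultiDiscrete space with two branches
action_outs = [torch.nn.Linear(hidden, n) for n in branch_sizes]

actions, action_log_probs = [], []
for action_out in action_outs:
    dist = Categorical(logits=action_out(x))
    action = dist.sample()  # deterministic mode would be dist.probs.argmax(-1)
    actions.append(action.unsqueeze(-1))
    action_log_probs.append(dist.log_prob(action).unsqueeze(-1))

actions = torch.cat(actions, -1)              # shape: (batch, num_branches)
action_log_probs = torch.cat(action_log_probs, -1)
print(actions.shape, action_log_probs.shape)  # torch.Size([4, 2]) twice
```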
[Issue Resolved] The problem was that the return value was being set to the action_shape instead of the sampled actions.
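A hypothetical reconstruction of the fix (variable names are illustrative, not taken from the repo):

```python
# The collect path was handing back the action definition, not the samples.
actions, action_log_probs = self.act(actor_features, available_actions, deterministic)
# return action_shape             # bug: the action-space shape, not the samples
return actions, action_log_probs  # fix: the tensors produced by ACTLayer.forward
```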
Shouldn't `def runner_collect(self, step):` be calling the trainer and fetching the actions via `get_actions`?
Right now the return value looks like it is the action-space definition itself: `elif self.envs.action_space[agent_id].__class__.__name__ == 'Box':`
TODO: fix the shape below into Discrete or MultiDiscrete.
That would explain why `sample()` was being used to hand back a temporary value.
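Putting the notes above together, `collect(step)` presumably should pull actions from the trained policy instead of sampling the space definition. A sketch assuming the standard on-policy runner layout (the buffer field names are assumptions from that reference, and `numpy` is imported as `np`):

```python
# Sketch of collect(step), assuming the standard on-policy runner layout.
@torch.no_grad()
def collect(self, step):
    self.trainer.prep_rollout()
    value, action, action_log_prob, rnn_states, rnn_states_critic = \
        self.trainer.policy.get_actions(
            np.concatenate(self.buffer.share_obs[step]),
            np.concatenate(self.buffer.obs[step]),
            np.concatenate(self.buffer.rnn_states[step]),
            np.concatenate(self.buffer.rnn_states_critic[step]),
            np.concatenate(self.buffer.masks[step]),
        )
    # Temporary hack this replaces:
    # action = self.envs.action_space[agent_id].sample()
    return value, action, action_log_prob, rnn_states, rnn_states_critic
```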