grooviiee / python_uav

Challenge to Reinforcement learning.

Update README.md #24

Closed grooviiee closed 1 year ago

grooviiee commented 1 year ago

Root cause: the `def step(self, action)` function does not appear to populate `obs_n` correctly.

Description: `self.world.world_take_step()` should update each agent's state according to its action, and the updated state should then be retrieved through `_get_state`. That does not appear to be happening.

    # action is coming with n_threads
    print(f"[ENV_STEP] current_step: {self.current_step}, STEP: {action}, length: {len(action)}/{len(self.action_space)}")
    self.current_step = self.current_step + 1
    obs_n = []
    reward_n = []
    done_n = []
    info_n = []

    self.agents = self.world.agents
    # set action for each agent
    for i, agent in enumerate(self.agents):
        self._set_action(i, action[0], agent, self.action_space[i])

    # advance world state
    self.world.world_take_step()  # core.step()

    # record observation for each agent
    for i, agent in enumerate(self.agents):
        obs_n.append([agent.state])
        reward_n.append([agent.reward])
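
For reference, here is a minimal sketch of the flow described above: set the actions, advance the world, then read each agent's observation through `_get_state` rather than appending `agent.state` directly. The `Agent`, `World`, and `Env` classes are hypothetical stand-ins, not the repository's actual classes, and the simplified `_set_action` and `_get_state` signatures are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Hypothetical stand-in for the repository's agent class.
    state: list = field(default_factory=lambda: [0.0, 0.0])
    action: list = field(default_factory=lambda: [0.0, 0.0])
    reward: float = 0.0


class World:
    # Hypothetical stand-in for the repository's world class.
    def __init__(self, n_agents):
        self.agents = [Agent() for _ in range(n_agents)]

    def world_take_step(self):
        # Apply each agent's pending action to its state (toy dynamics).
        for agent in self.agents:
            agent.state = [s + a for s, a in zip(agent.state, agent.action)]
            agent.reward = -sum(abs(s) for s in agent.state)


class Env:
    def __init__(self, n_agents):
        self.world = World(n_agents)
        self.agents = self.world.agents

    def _set_action(self, agent, action):
        # Assumed simplified signature: store the action on the agent.
        agent.action = list(action)

    def _get_state(self, agent):
        # Return a copy so callers cannot alias the agent's internal state.
        return list(agent.state)

    def step(self, action_n):
        obs_n, reward_n = [], []
        self.agents = self.world.agents

        # Set one action per agent.
        for agent, action in zip(self.agents, action_n):
            self._set_action(agent, action)

        # Advance the world state.
        self.world.world_take_step()

        # Read observations only after the world has stepped.
        for agent in self.agents:
            obs_n.append(self._get_state(agent))
            reward_n.append([agent.reward])
        return obs_n, reward_n
```

If `obs_n` still looks stale with this ordering in the real environment, the next thing to check would be whether `world_take_step()` actually writes to the same agent objects that `step` iterates over.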