aidudezzz / deepbots

A wrapper framework for Reinforcement Learning in the Webots robot simulator using Python 3.
https://deepbots.readthedocs.io/
GNU General Public License v3.0

There seems to be a bug in the step function of the robot_supervisor.py #80

Closed RLMilestone closed 1 year ago

RLMilestone commented 3 years ago
        if super(Supervisor, self).step(self.timestep) == -1:
            exit()

        self.apply_action(action)
        return (
            self.get_observations(),
            self.get_reward(action),
            self.is_done(),
            self.get_info(),
        )

In RL, it seems more natural to call apply_action first and then Supervisor.step(). Otherwise, the observation and reward you get back do not yet reflect the action you just applied (the response is delayed by one timestep!).
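
To make the one-timestep delay concrete, here is a minimal sketch that compares the two orderings on a toy simulator standing in for Webots. `ToySim`, `StepThenApply`, and `ApplyThenStep` are hypothetical names made up for this illustration; they are not part of deepbots or the Webots API, and the "physics" is just a position that advances by the current velocity on every step.

```python
class ToySim:
    """Hypothetical stand-in for the simulator: each step moves the
    position by whatever velocity is currently set."""

    def __init__(self):
        self.position = 0.0
        self.velocity = 0.0

    def step(self):
        self.position += self.velocity


class StepThenApply:
    """Ordering from the quoted snippet: advance the simulation first,
    then apply the action."""

    def __init__(self):
        self.sim = ToySim()

    def step(self, action):
        self.sim.step()             # simulation advances with the OLD action
        self.sim.velocity = action  # new action only takes effect next step
        return self.sim.position    # observation does not reflect `action` yet


class ApplyThenStep:
    """Proposed ordering: apply the action, then advance the simulation."""

    def __init__(self):
        self.sim = ToySim()

    def step(self, action):
        self.sim.velocity = action  # action is set before the simulation runs
        self.sim.step()             # simulation advances with the NEW action
        return self.sim.position    # observation already reflects `action`


# Push forward once, then stop.
actions = [1.0, 0.0, 0.0]

env_a = StepThenApply()
delayed = [env_a.step(a) for a in actions]

env_b = ApplyThenStep()
immediate = [env_b.step(a) for a in actions]

print("step then apply:", delayed)    # [0.0, 1.0, 1.0]
print("apply then step:", immediate)  # [1.0, 1.0, 1.0]
```

With the step-then-apply ordering, the first observation is still 0.0 even though the agent asked to move; the effect only shows up in the observation returned for the next action.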

tsampazk commented 3 years ago

Hello @RLMilestone, thank you for opening this issue! This has been a point of debate for the project since the very beginning. Indeed, it seems more natural to step the controller after calling apply_action in the robot-supervisor scheme.

We will have to look into it in depth, because we would like the robot-supervisor scheme and the emitter-receiver scheme to behave the same way. In the emitter-receiver scheme, however, changing the order might cause unforeseen issues due to the way the messages are transmitted.