Open RPegoud opened 11 months ago
Currently, the Breakout environment terminates only via the step-limit condition:
done = jnp.logical_or(state.done, state.time >= self.max_steps_in_episode)
Task: verify the game logic of the environment and fix the termination issue.
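As a starting point for the verification, here is a minimal sketch of a termination check that combines a game-over condition with the step limit. Everything in it is an assumption for illustration: the `State` fields other than `done` and `time`, the `GRID_HEIGHT` and `MAX_STEPS_IN_EPISODE` constants, and the "ball moved past the bottom row" rule are hypothetical and not taken from the environment's code.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class State(NamedTuple):
    # Hypothetical minimal state; the real environment state has more fields.
    ball_y: jnp.ndarray  # row index of the ball, 0 is the top of the grid
    time: jnp.ndarray    # number of steps elapsed in the episode
    done: jnp.ndarray    # game-over flag carried in the state


GRID_HEIGHT = 10             # assumed grid size
MAX_STEPS_IN_EPISODE = 1000  # assumed step limit


def is_terminal(state: State) -> jnp.ndarray:
    # Assumed game-over rule: the ball has moved past the bottom row,
    # i.e. the paddle missed it.
    ball_lost = state.ball_y >= GRID_HEIGHT
    game_over = jnp.logical_or(state.done, ball_lost)
    # Same step-limit condition as in the issue, with the limit as a constant.
    step_limit = state.time >= MAX_STEPS_IN_EPISODE
    return jnp.logical_or(game_over, step_limit)


# Quick checks that can be turned into unit tests.
mid_game = State(jnp.array(3), jnp.array(5), jnp.array(False))
ball_missed = State(jnp.array(GRID_HEIGHT), jnp.array(5), jnp.array(False))
timed_out = State(jnp.array(3), jnp.array(MAX_STEPS_IN_EPISODE), jnp.array(False))
```

A useful regression test is to confirm that a state like `ball_missed` terminates well before the step limit, and that the check still works under `jax.jit`.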