Why does VecFrameStack clear the prior frames in the stack for the step when "terminated=True"?

wkwan commented 3 months ago

❓ Question

I'm currently using VecFrameStack with my custom gym environment, which is a 1v1 game.

To debug training, I'm saving the frames in the stack whenever one player kills another (which I know is the case when reward > 9000):

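from PIL import Image
from stable_baselines3.common.vec_env import VecFrameStack

class VecFrameStackSaveOnKill(VecFrameStack):

    def __init__(self, venv, n_stack, starting_timestep=0):
        super().__init__(venv, n_stack)
        self.cur_step = starting_timestep
        self.n_stack = n_stack

    def step_wait(self):
        # Keep the stacked observation in a local variable rather than on self.
        obs, rewards, dones, infos = super().step_wait()
        # A large reward magnitude in env 0 means one player killed the other.
        if abs(rewards[0]) > 9000:
            for i in range(self.n_stack):
                # Each stacked frame occupies 3 RGB channels (channels-last).
                # `args` is defined elsewhere in the training script.
                Image.fromarray(obs[0, :, :, i * 3:i * 3 + 3]).save(f"{args.checkpoint_folder}/img_player_killed_opponent_stacked/step_{self.cur_step}_{i}_player_killed_opponent.png")
        self.cur_step += 1
        return obs, rewards, dones, infos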

What I discovered is that in my custom environment, if I set terminated=True when one player kills another, the frame stack that gets saved is 3 black frames followed by the terminal frame (n_stack is 4). I'm confused by this behavior: I would expect the terminal frame stack to still contain the 3 prior frames, with the stack being cleared on the next step. Why does the terminal frame stack only include 1 frame?

I tested setting terminated=False when one player kills another, and in this case the frame stack saves all 4 frames when the reward > 9000. But I'm not sure if this is how the API is intended to be used, given that I want the episode to end when one player kills another.

In case it's helpful, I'm training with RecurrentPPO.
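
For context, a likely explanation is the VecEnv auto-reset: when an episode ends, the vectorized env resets immediately, so the observation returned from that step is the first frame of the next episode, and the true last frame is passed along in infos[i]["terminal_observation"]. As far as I understand, VecFrameStack stacks that terminal frame onto the previous frames inside infos and then zeroes the stack before inserting the reset frame. The sketch below imitates that logic with plain NumPy for a single env. ToyFrameStack, solid, and the frame values are hypothetical names made up for illustration, not the library's code.

```python
import numpy as np

N_STACK, H, W, C = 4, 2, 2, 3  # tiny RGB frames, for illustration only


class ToyFrameStack:
    """Simplified single-env, channels-last stand-in for a frame stacker."""

    def __init__(self):
        self.stack = np.zeros((H, W, C * N_STACK), dtype=np.uint8)

    def update(self, frame, done, info):
        # Shift older frames toward the front to make room for the newest one.
        self.stack = np.roll(self.stack, shift=-C, axis=-1)
        if done:
            # `frame` is already the first frame of the NEXT episode.
            # The real final frame arrives in info["terminal_observation"]:
            # stack it onto the previous frames and return it through info.
            previous = self.stack[..., :-C]
            info["terminal_observation"] = np.concatenate(
                (previous, info["terminal_observation"]), axis=-1
            )
            # The new episode starts from a blank (black) stack.
            self.stack[...] = 0
        self.stack[..., -C:] = frame
        return self.stack.copy(), info


def solid(value):
    """Return a frame filled with a single pixel value."""
    return np.full((H, W, C), value, dtype=np.uint8)


stacker = ToyFrameStack()
for v in (10, 20, 30):
    obs, _ = stacker.update(solid(v), done=False, info={})

# Terminal step: the env hands back the reset frame (99) as the observation
# and the true last frame (40) in info.
obs, info = stacker.update(
    solid(99), done=True, info={"terminal_observation": solid(40)}
)

returned = [int(obs[0, 0, i * C]) for i in range(N_STACK)]
terminal = [int(info["terminal_observation"][0, 0, i * C]) for i in range(N_STACK)]
print(returned)  # [0, 0, 0, 99]
print(terminal)  # [10, 20, 30, 40]
```

If this matches what the wrapper does, the 3 black frames plus 1 frame saved on the kill step would be the start of the next episode, and the full 4-frame terminal stack would be found in infos[0]["terminal_observation"] instead of in the returned observation.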

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] If code there is, it is minimal and working
[X] If code there is, it is formatted using the markdown code blocks for both code and stack traces.

wkwan commented 3 months ago

@araffin sorry what's missing in the checklist, did you want to see my custom environment as well? It's a Fortnite custom game, requires some manual navigation within the game to set things up properly, but here's my env code: https://github.com/wkwan/ScrimBrain/blob/master/fortnite_env.py

araffin commented 3 months ago

sorry what's missing in the checklist, did you want to see my custom environment as well?

"If code there is, it is minimal and working". Please have a look at the linked issue for what minimal and working means exactly: https://github.com/DLR-RM/stable-baselines3/issues/982#issuecomment-1197044014

DLR-RM / stable-baselines3

Why does VecFrameStack clear the prior frames in the stack for the step when "terminated=True"? #1883
