Martyn0324 / SerpentAI

Game Agent Framework. Helping you create AIs / Bots that learn to play any game you own!
http://serpent.ai
MIT License

SerpentAI RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 4, 8, 8], but got 5-dimensional input of size [1, 4, 100, 100, 3] instead #7

Closed Martyn0324 closed 2 years ago

Martyn0324 commented 2 years ago

Expected result

Continue training the RainbowDQN agent, resuming training for the 4th time, and this time register the correct run number and step count in the JSON file.

Encountered result

Traceback (most recent call last):
  File "d:\anaconda\envs\serpent\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\anaconda\envs\serpent\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda\envs\Serpent\Scripts\serpent.exe\__main__.py", line 7, in <module>
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\serpent.py", line 52, in execute
    command_function_mapping[command](*sys.argv[2:])
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\serpent.py", line 369, in play
    game.play(game_agent_class_name=game_agent_name, frame_handler=frame_handler)
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\game.py", line 202, in play
    raise e
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\game.py", line 193, in play
    game_agent.on_game_frame(game_frame, frame_handler=frame_handler, **kwargs)
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\game_agent.py", line 115, in on_game_frame
    frame_handler(game_frame, **kwargs)
  File "D:\SerpentAI\plugins\SerpentNimbleAngelGameAgentPlugin\files\serpent_NimbleAngel_game_agent.py", line 236, in handle_play
    agent_actions = self.agent_actions.generate_actions(frame_buffer)
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\machine_learning\reinforcement_learning\agents\rainbow_dqn_agent.py", line 180, in generate_actions
    self.current_action = self.agent.act(self.current_state)
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\machine_learning\reinforcement_learning\rainbow_dqn\rainbow_agent.py", line 74, in act
    return (self.online_net(state.unsqueeze(0)) * self.support).sum(2).argmax(1).item()
  File "d:\anaconda\envs\serpent\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\anaconda\envs\serpent\lib\site-packages\serpent\machine_learning\reinforcement_learning\rainbow_dqn\dqn.py", line 23, in forward
    x = torch.nn.functional.relu(self.conv1(x))
  File "d:\anaconda\envs\serpent\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\anaconda\envs\serpent\lib\site-packages\torch\nn\modules\conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "d:\anaconda\envs\serpent\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 4, 8, 8], but got 5-dimensional input of size [1, 4, 100, 100, 3] instead

Steps to reproduce

  1. Use self.frame_transformation_pipeline_string = "RESIZE:100x100"
  2. Insert self.game_inputs into the game's API
  3. Use RainbowDQNAgent
  4. Run the code
  5. Stop the code after it saves the model
  6. Change the code, adding to the setup_play function the following lines:

        self.agent_actions.current_episode = self.game_state['current_run']
        self.agent_mouse.current_episode = self.agent_actions.current_episode
    
        self.agent_actions.current_step = self.game_state['current_run_steps']
        self.agent_mouse.current_step = self.agent_actions.current_step
  7. Run the code again

Note: While this problem happens when loading model weights for RainbowDQN, it happens ALL THE TIME when using the PPO agent.
Note²: This also seems to happen when you change your game inputs. That isn't the case right now.
Note³: This issue doesn't seem to affect the AI's performance at all. It's purely an "aesthetic" issue.

Martyn0324 commented 2 years ago

It seems that this error only happens when setting RainbowDQNAgent.current_episode; agent.current_step works just fine.

After those changes are made and the model is saved, it won't load again, so it's necessary to delete the saved model weights and start the training all over again.

I checked the RainbowDQNAgent file:

    def generate_actions(self, state, **kwargs):
        frames = list()

        for game_frame in state.frames:
            frames.append(torch.tensor(torch.from_numpy(game_frame.frame), dtype=torch.float32, device=self.device))

        self.current_state = torch.stack(frames, 0)

        if self.mode == RainbowDQNAgentModes.OBSERVE and self.observe_mode == "RANDOM":
            self.current_action = random.randint(0, len(self.game_inputs[0]["inputs"]) - 1)
        elif self.mode == RainbowDQNAgentModes.OBSERVE and self.observe_mode == "MODEL":
            self.agent.reset_noise()
            self.current_action = self.agent.act(self.current_state)
        elif self.mode == RainbowDQNAgentModes.TRAIN:
            self.agent.reset_noise()
            self.current_action = self.agent.act(self.current_state)

        actions = list()

        label = self.game_inputs_mappings[0][self.current_action]
        action = self.game_inputs[0]["inputs"][label]
        value = self.game_inputs[0]["value"]

        actions.append((label, action, value))

        for action in actions:
            self.analytics_client.track(
                event_key="AGENT_ACTION",
                data={
                    "label": action[0],
                    "action": [str(a) for a in action[1]],
                    "input_value": action[2]
                }
            )

        return actions

At first, I thought there might be some relation between self.current_state and self.current_episode. But there is not.

I have no idea what the variable agent.current_episode has to do with self.current_action = self.agent.act(self.current_state).

It seems that any kind of change made to the variable self.current_episode causes this problem.

However, from all I can see in /serpent/machine_learning/reinforcement_learning/agents/rainbow_dqn_agent, this variable is only responsible for the JSON file AND is only used by the observe function when terminal = True AND self.mode == RainbowDQNAgentModes.TRAIN. I don't use the TRAIN mode, though (it's basically the same thing as using the OBSERVE mode).

EDIT: In fact, agent.current_episode has nothing to do with self.current_action. I don't remember why I made this association.

If you're having problems with this variable, you might be looking for this issue.

Martyn0324 commented 2 years ago

EDIT: It looks like I started this topic trying to fix the error caused by modifying agent.current_episode, just to have a more accurate JSON file and to be able to use arguments like target_update, but I ended up fixing a problem that prevented me from loading my saved models when working with colored frames.

It seems that the expected input format is (n_samples, channels, height, width). I've used self.frame_transformation_pipeline_string = "RESIZE:100x100", which explains the 100, 100. The 4 seems to come from the FrameGrabber call frame_buffer = FrameGrabber.get_frames([0, 2, 4, 6], frame_type="PIPELINE"). My input also has a 5th dimension, which is the 3 color channels. The 1, though, I can't explain; maybe the model is adding it automatically.
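Just to make the shapes concrete, here's a shape-only sketch of how that tensor gets built (plain PyTorch, mimicking the torch.stack in generate_actions and the state.unsqueeze(0) call shown in the traceback):

    import torch

    # Four 100x100 RGB frames, as produced by "RESIZE:100x100" without a GRAYSCALE step
    frames = [torch.rand(100, 100, 3) for _ in range(4)]

    state = torch.stack(frames, 0)    # torch.Size([4, 100, 100, 3])
    state = state.unsqueeze(0)        # torch.Size([1, 4, 100, 100, 3]) -- the shape from the error message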

Using self.frame_transformation_pipeline_string = "RESIZE:100x100|GRAYSCALE" fixes this problem. It seems that RainbowDQN can only load models that use grayscale game frames.
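For comparison, with the GRAYSCALE step each frame has a single implicit channel, so the same stacking already produces the 4D shape the network expects (again a shape-only sketch):

    import torch

    # Four 100x100 grayscale frames, as produced by "RESIZE:100x100|GRAYSCALE"
    frames = [torch.rand(100, 100) for _ in range(4)]

    state = torch.stack(frames, 0)    # torch.Size([4, 100, 100])
    state = state.unsqueeze(0)        # torch.Size([1, 4, 100, 100]) -- valid (n_samples, channels, height, width)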

However, while grayscale game frames work in most cases, they might not be that interesting in certain games where colors are essential (Jigoku Kisetsukan is one example).

Perhaps removing that 1 when loading a model and moving the 3 to the second position might fix this problem.

Martyn0324 commented 2 years ago

serpent.machine_learning.reinforcement_learning.rainbow_dqn.rainbow_agent.py uses the DQN3 model from .dqn.py, which receives the argument history.

history corresponds to the sequence of states we're analysing - in this case, the frames grabbed through the FrameGrabber. By default this argument is 4, and it's used as the number of input channels when creating the PyTorch 2D convolutional layers in the DQN3 model.

PyTorch is kind of confusing to me, but, as far as I can see, although creating those layers only requires the input's number of channels, passing inputs to them requires the format (n_samples, n_channels, height, width).

In rainbow_agent.py, PyTorch's tensor.unsqueeze() function is used to add the n_samples dimension. Therefore, we pass 1 sequence of 4 frames, each frame having, in this case, 100 pixels of height and 100 of width.

However, the input must have exactly 4 dimensions. If we add colored frames, we'll also have 3 channels accompanying the height and width, so the input becomes (1, 4, 100, 100, 3).
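To make that concrete, here's a small sketch assuming a layer matching the weight [32, 4, 8, 8] from the error (in_channels=4, out_channels=32, kernel_size=8; the stride of 4 is my assumption, following the usual DQN architecture - the error only depends on the input's dimensionality):

    import torch
    import torch.nn as nn

    conv1 = nn.Conv2d(4, 32, kernel_size=8, stride=4)    # creating the layer only needs the channel counts

    grayscale_input = torch.rand(1, 4, 100, 100)          # (n_samples, n_channels, height, width)
    print(conv1(grayscale_input).shape)                   # torch.Size([1, 32, 24, 24]) -- accepted

    colored_input = torch.rand(1, 4, 100, 100, 3)         # colored frames add a 5th dimension
    try:
        conv1(colored_input)
    except RuntimeError as error:
        print(error)                                      # "Expected 4-dimensional input ... got 5-dimensional input" on this PyTorch version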

So, how could someone load a model without having to pass GRAYSCALE to the self.frame_transformation_pipeline_string variable in the game plugin? Looking at some illustrations of 2D convolution, one can see that the input image is usually grayscale, therefore having a format of (height, width, 1), so the 3rd dimension can be "discarded" during the process.

[image: conv2d illustration]

However, when using a colored image (usually RGB), the input won't be a single "sheet of paper"; it'll also have a "thickness" of 3, corresponding to the 3 RGB channels.

[image: rgb illustration]

If RainbowDQN considers each frame to be grayscale (therefore having 1 as its channel dimension, like a "sheet of paper"), then, if we want to work with colored images (which are thick blocks of 3 sheets together), we just have to multiply those 4 frames by their 3 channels. Then we'll have 12 as our history.

I tried to modify SerpentAI's code to do this process automatically, but, in the end, I noticed that it's way easier for the user to simply pass rainbow_kwargs = dict(history=4*3) to the RainbowDQNAgent call. The only modification needed was to make sure the 3rd channel of the frames is folded into the history dimension, so the DQN3 input stays 4D, in .agents.rainbow_dqn_agent.py:

        # Only reshape colored (4D) states; grayscale states are already (history, height, width)
        if self.current_state.dim() == 4 and self.current_state.shape[3] != 1:
            # Fold the color channels into the history dimension so DQN3 receives a 4D input after unsqueeze(0)
            self.current_state = self.current_state.view(self.agent_kwargs['history'], self.current_state.shape[1], self.current_state.shape[2])
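With that change, the shapes line up roughly like this (a shape-only sketch of the idea, not the actual SerpentAI code path):

    import torch
    import torch.nn as nn

    history = 4 * 3                                                # rainbow_kwargs = dict(history=4*3)

    state = torch.rand(4, 100, 100, 3)                             # 4 stacked RGB frames from the frame buffer
    state = state.view(history, state.shape[1], state.shape[2])    # torch.Size([12, 100, 100])

    conv1 = nn.Conv2d(history, 32, kernel_size=8, stride=4)        # first DQN3 layer now expects 12 channels
    print(conv1(state.unsqueeze(0)).shape)                          # torch.Size([1, 32, 24, 24])

One caveat: view only reinterprets the existing memory layout, so the 12 resulting planes aren't clean per-channel images; permuting the channel axis before reshaping would keep each plane a proper 2D slice, but as far as the shape check goes only the channel count matters.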

At least, all of this seems to make sense to me. In DeepMind's original Rainbow DQN paper, it seems they only analysed grayscale images; I don't know if they adapted it to colored images later. In their case, colors seem to have been irrelevant, and this also applies to most cases where Conv2D is used. However, in some scenarios, like certain games, knowing the colors can be essential for choosing an action (in Jigoku Kisetsukan, for example, health, power, score and aura charms might have the same size, but they have different colors).

Perhaps an alternative would be building RainbowDQN with Conv3D layers... but I have the feeling that this would only be more computationally expensive.
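For reference, a Conv3D layer would accept the colored stack directly, treating the 3 RGB values as channels and the 4 frames as depth (a sketch of the first layer only; the rest of DQN3 would need to change accordingly):

    import torch
    import torch.nn as nn

    state = torch.rand(1, 4, 100, 100, 3)                 # the 5D input from the error message
    state = state.permute(0, 4, 1, 2, 3)                   # (N, C, D, H, W) = (1, 3, 4, 100, 100)

    conv1 = nn.Conv3d(3, 32, kernel_size=(1, 8, 8), stride=(1, 4, 4))
    print(conv1(state).shape)                              # torch.Size([1, 32, 4, 24, 24])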