Modify the `MultiDiscrete` action space definition

I just realized that the `nvec` of gym's `MultiDiscrete` action space can be a multi-dimensional array. We should redefine the action space to have shape `(h, w, 7)`, as follows:

import gym
import numpy as np

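# nvec with shape (h, w, 7); here h = w = 2
shapes = np.ones((2, 2, 7), dtype=np.int32)
# broadcast the 7 per-cell component sizes to every cell
shapes[:] = [6, 4, 4, 4, 4, 7, 7]
space = gym.spaces.MultiDiscrete(shapes)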

>>> shapes
array([[[6, 4, 4, 4, 4, 7, 7],
        [6, 4, 4, 4, 4, 7, 7]],

       [[6, 4, 4, 4, 4, 7, 7],
        [6, 4, 4, 4, 4, 7, 7]]], dtype=int32)
>>> shapes.shape
(2, 2, 7)
>>> space
MultiDiscrete([[[6 4 4 4 4 7 7]
  [6 4 4 4 4 7 7]]

 [[6 4 4 4 4 7 7]
  [6 4 4 4 4 7 7]]])
>>> space.sample()
array([[[2, 3, 2, 1, 0, 4, 6],
        [0, 1, 2, 2, 2, 1, 3]],

       [[5, 0, 2, 1, 0, 3, 3],
        [4, 1, 3, 3, 3, 5, 2]]])

This would simplify the flatten and reshape logic in places such as:

https://github.com/vwxyzjn/gym-microrts/blob/3d7a42f46efbd39a0b806388b8a445fbee48d00f/experiments/ppo_gridnet.py#L428
https://github.com/vwxyzjn/gym-microrts/blob/3d7a42f46efbd39a0b806388b8a445fbee48d00f/gym_microrts/envs/vec_env.py#L168
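
To illustrate the kind of simplification meant here, below is a rough NumPy-only sketch, not the repository's actual code. The names `per_cell`, `flat_nvec`, and `shaped_nvec` are made up for this example, and `rng.integers` is only a stand-in for `space.sample()`. It compares a flat `nvec` of length `h * w * 7` with the shaped one and shows the offset arithmetic that the shaped layout would avoid.

```python
import numpy as np

h, w = 2, 2  # map size used in the example above
per_cell = np.array([6, 4, 4, 4, 4, 7, 7])  # component sizes for one cell

# Flat definition (assumed for comparison): one long nvec of length h * w * 7.
flat_nvec = np.tile(per_cell, h * w)

# Shaped definition: the per-cell sizes broadcast over an (h, w, 7) array.
shaped_nvec = np.broadcast_to(per_cell, (h, w, len(per_cell))).copy()

# Both layouts describe the same space and differ only by a reshape.
assert np.array_equal(shaped_nvec.reshape(-1), flat_nvec)

# Stand-in for space.sample(): each entry is drawn uniformly from [0, nvec).
rng = np.random.default_rng(0)
action = rng.integers(0, shaped_nvec)  # shape (h, w, 7)

# With the shaped layout, the action for the cell at (y, x) is a direct index ...
y, x = 1, 0
cell_action = action[y, x]

# ... whereas the flat layout needs manual offset arithmetic.
flat_action = action.reshape(-1)
start = (y * w + x) * len(per_cell)
assert np.array_equal(flat_action[start:start + len(per_cell)], cell_action)
```

If the environment and the agent both work with the `(h, w, 7)` layout, the `reshape(-1)` and offset steps would presumably no longer be needed on either side.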
