inoryy / reaver

Reaver: Modular Deep Reinforcement Learning Framework. Focused on StarCraft II. Supports Gym, Atari, and MuJoCo.
MIT License

Implementation of transformer into model #31

Open kimbring2 opened 5 years ago

kimbring2 commented 5 years ago

Hello, thank you for sharing this great code.

I am trying to solve the DefeatRoaches minigame using a Relational Network.

I found an example of Transformer code for MNIST classification and modified the fully_conv.py file accordingly. Unlike the original code, I only use the screen feature, without the minimap feature. However, the result is still not good.
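For context, the kind of modification described above amounts to running self-attention over the flattened screen feature map. Here is a minimal NumPy sketch of that idea; the weight matrices, shapes, and the `self_attention` helper are illustrative, not taken from fully_conv.py:

```python
import numpy as np

def self_attention(x, d_k):
    """Single-head scaled dot-product self-attention over a set of tokens.

    x: (n_tokens, d_model) -- e.g. an HxW screen feature map flattened
       into H*W tokens, each with d_model channels.
    """
    rng = np.random.default_rng(0)
    # Hypothetical projection weights; in a real model these are learned.
    w_q = rng.standard_normal((x.shape[1], d_k)) / np.sqrt(x.shape[1])
    w_k = rng.standard_normal((x.shape[1], d_k)) / np.sqrt(x.shape[1])
    w_v = rng.standard_normal((x.shape[1], d_k)) / np.sqrt(x.shape[1])

    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d_k)                   # (n, n) pairwise relations
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (n, d_k)

# A 16x16 screen feature map with 8 channels, flattened to 256 tokens.
screen = np.random.default_rng(1).standard_normal((16 * 16, 8))
out = self_attention(screen, d_k=8)
print(out.shape)  # (256, 8)
```

Note that attention over all H*W cells is quadratic in the number of pixels, which is one reason this gets expensive quickly on larger screens.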

Could you give me a recommendation on how to modify it to reach DeepMind's performance?

Thank you. From Dohyeong

inoryy commented 5 years ago

Hello Dohyeong,

I assume you're trying to replicate the AlphaStar architecture?

First, note that spatial information is still processed by a normal conv model, and the transformer body is applied only to a flattened list of unit information (see the architecture figure below).
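To illustrate that split, here is a rough NumPy sketch of the two parallel paths: a plain convolution over the screen versus a transformer-style attention over a flat unit list. Everything here (feature counts, identity Q/K/V projections, the toy `conv2d`) is a simplified stand-in, not the AlphaStar architecture itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Spatial path: plain conv over the screen (unchanged by the transformer) ---
def conv2d(x, kernel):
    """Valid 3x3 convolution, single channel, no padding -- a toy stand-in."""
    kh, kw = kernel.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * kernel)
    return out

screen = rng.standard_normal((8, 8))
spatial_out = conv2d(screen, rng.standard_normal((3, 3)))   # (6, 6)

# --- Entity path: transformer body over a flat list of units ---
units = rng.standard_normal((5, 16))   # 5 units, 16 features each (hypothetical)
d_k = 16
scores = units @ units.T / np.sqrt(d_k)   # identity Q/K/V projections for brevity
w = np.exp(scores - scores.max(-1, keepdims=True))
w /= w.sum(-1, keepdims=True)             # row-wise softmax
entity_out = w @ units                    # (5, 16) relational unit embeddings

print(spatial_out.shape, entity_out.shape)  # (6, 6) (5, 16)
```

The key point is that attention runs over a handful of unit tokens, not over every screen pixel, which is far cheaper and matches how the entity encoder is described.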

Second, the transformer would require a full view of the game map, as opposed to only the camera viewport we currently get from PySC2. There is a WIP patch to add support for it, but no ETA on when it will be released.

Third, from your results figures I see that the mean reward is stuck around 14, which, as far as I remember, is about the score you would get from random actions. I'm afraid this implies your model hasn't learned much beyond the initial jump, so improving from there might be quite difficult.

What I would recommend is to simplify the model as much as possible: reduce the state/action spaces to the bare minimum and remove the non-spatial and minimap info blocks. After verifying that this works, try adding a transformer body as a separate layer only on the player_relative feature and merge it with the final spatial state block before the final policy/value layers.
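The merge step suggested above could look something like this; a minimal NumPy sketch assuming the player_relative layer has been embedded to a few channels per cell, with identity Q/K/V projections and made-up dimensions throughout:

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 8

# player_relative screen layer, embedded to d channels per cell (hypothetical),
# flattened to H*W tokens for the attention step.
d = 4
player_relative = rng.standard_normal((H * W, d))

# Simple self-attention over the grid cells (identity Q/K/V for brevity).
scores = player_relative @ player_relative.T / np.sqrt(d)
attn = np.exp(scores - scores.max(-1, keepdims=True))
attn /= attn.sum(-1, keepdims=True)              # row-wise softmax
relational = (attn @ player_relative).reshape(H, W, d)

# Final spatial state block from the existing conv stack (stand-in values).
spatial_state = rng.standard_normal((H, W, 16))

# Merge by channel-wise concatenation before the policy/value heads.
merged = np.concatenate([spatial_state, relational], axis=-1)
print(merged.shape)  # (8, 8, 20)
```

Concatenation along the channel axis is only one possible merge; summation or a learned projection would also work, but concatenation keeps the two information streams separable for debugging.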