alexcbb / Genie-Generative-Interactive-Environments

This repo aims to reproduce and open-source the results of "Generative Interactive Environments" by Google DeepMind.
MIT License

[Feature] Mask-GiT #3

Open alexcbb opened 6 months ago

alexcbb commented 6 months ago

Feature details

The model's dynamics are handled by a decoder-only Mask-GiT (MaskGIT) transformer. Given the tokenized video frames so far (from the VQ-VAE tokenizer) and the corresponding latent actions (from the latent action model), it predicts the tokens of the next frame.
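To make the inference loop concrete, here is a minimal sketch of MaskGIT-style iterative decoding for one frame, assuming a cosine unmasking schedule as in the original MaskGIT work. It is not this repo's implementation: `MASK`, `cosine_mask_count`, `predict_next_frame`, and `toy_predictor` are hypothetical names, and the toy predictor stands in for the real transformer, which would return a token and a confidence for every position.

```python
import math
from typing import Callable, List, Tuple

MASK = -1  # hypothetical id marking a position that is still masked

# A predictor maps (past frames, latent actions, partially filled frame)
# to one (token, confidence) pair per position of the next frame.
Predictor = Callable[[List[List[int]], List[int], List[int]], List[Tuple[int, float]]]


def cosine_mask_count(step: int, num_steps: int, num_tokens: int) -> int:
    """Number of tokens left masked after `step` (1-based) of `num_steps`."""
    ratio = math.cos(math.pi / 2 * step / num_steps)
    return max(0, math.floor(ratio * num_tokens))


def predict_next_frame(
    past_frames: List[List[int]],
    latent_actions: List[int],
    tokens_per_frame: int,
    predictor: Predictor,
    num_steps: int = 8,
) -> List[int]:
    """Start from a fully masked frame and unmask it over `num_steps` passes."""
    frame = [MASK] * tokens_per_frame
    for step in range(1, num_steps + 1):
        preds = predictor(past_frames, latent_actions, frame)
        masked = [i for i, tok in enumerate(frame) if tok == MASK]
        keep_masked = cosine_mask_count(step, num_steps, tokens_per_frame)
        num_to_fill = len(masked) - keep_masked
        # Commit the most confident predictions first; the rest stay masked.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:num_to_fill]:
            frame[i] = preds[i][0]
    return frame


def toy_predictor(
    past_frames: List[List[int]], latent_actions: List[int], frame: List[int]
) -> List[Tuple[int, float]]:
    """Stand-in for the transformer: shifts the last frame by the last action."""
    vocab_size = 8  # assumed codebook size, for illustration only
    last_frame, action = past_frames[-1], latent_actions[-1]
    return [
        ((last_frame[i] + action) % vocab_size, 1.0 / (1 + i))
        for i in range(len(frame))
    ]


if __name__ == "__main__":
    frames = [[0, 1, 2, 3], [4, 5, 6, 7]]  # two past frames of 4 tokens each
    actions = [1, 2]                       # one latent action per past frame
    print(predict_next_frame(frames, actions, 4, toy_predictor))
```

In a real model the predictor would be the trained transformer, and committed tokens would typically be sampled from its output distribution rather than taken deterministically.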

What needs to be done