how does the sample_data work?

cocktailpeanut commented 3 weeks ago

Looks like you can only generate from the three videos in the sample_data folder as seeds, because it needs to parse the corresponding pt files.

Couldn't find any documentation about the pt files. Is there a way to generate the pt file given an mp4 file?

Also, is this model trained just for these minecraft videos, or can you do the same thing for any video?

julian-q commented 3 weeks ago

Great question @cocktailpeanut !

In fact, you can use any actions file with any prompt video, since the model allows for general controllability. Sorry the code doesn't make this very clear :P

While we have only included three sample videos so far, you can experiment with downloading your own mp4, resizing it to (360, 640) resolution, and changing just the mp4_path to point to that video.
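A minimal sketch of the resize step, assuming ffmpeg is installed and on your PATH, and assuming (360, 640) means (height, width). The helper names `build_resize_command` and `resize_video` are hypothetical and not part of this repo:

```python
import subprocess
from pathlib import Path

# Assumption: (360, 640) in the comment above is (height, width).
TARGET_HEIGHT = 360
TARGET_WIDTH = 640


def build_resize_command(src, dst, width=TARGET_WIDTH, height=TARGET_HEIGHT):
    """Return the ffmpeg argument list that rescales src and writes dst."""
    return [
        "ffmpeg",
        "-y",                              # overwrite dst if it already exists
        "-i", str(src),                    # input video
        "-vf", f"scale={width}:{height}",  # ffmpeg's scale filter takes width:height
        str(dst),
    ]


def resize_video(src, dst):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_resize_command(src, dst), check=True)
    return Path(dst)
```

Calling `resize_video("my_clip.mp4", "my_clip_360p.mp4")` would write the resized file, which you could then use as mp4_path. Note that a plain `scale=640:360` stretches the frame if the source is not 16:9, so you may want to crop first.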

> Couldn't find any documentation about the pt files. Is there a way to generate the pt file given an mp4 file?

The videos and actions.pt files come from OpenAI's VPT dataset, after some preprocessing. You can get more action data from the data they collected, or you can even run their IDM model on a Minecraft gameplay mp4. I can potentially include a conversion script to get VPT actions in the format our model uses.

> Also, is this model trained just for these minecraft videos, or can you do the same thing for any video?

This model was trained on all of VPT, so it should work for a wide variety of Minecraft video prompts! (And you can even try using a single Minecraft image, since all you need is a single prompt frame.)

acalasanzs commented 3 weeks ago

That sounds so great, but how can I upscale and add more detail, something like a high-res fix? @julian-q

julian-q commented 3 weeks ago

@acalasanzs Hmm, you can experiment with applying an upscaling model to the final output. But we don't provide that here for now :)

etched-ai / open-oasis

how does the sample_data work? #3