MineDojo / MineCLIP

Foundation Model for MineDojo
MIT License

Potential issues with MineCLIP weights #6

Open nickioan opened 1 year ago

nickioan commented 1 year ago

I have loaded the weights into the MineCLIP model for both the attn and avg variants to observe how the generated reward varies over a video of a user playing Minecraft while following one of the provided tasks. The generated reward appears to remain stagnant throughout the video regardless of the text prompt. In addition, when I use randomly generated frames or zeroed frames, the output is still very similar.

I am fairly certain that my video loading and weight loading follow the existing documentation, so I am wondering whether the currently uploaded weights for either variant are incorrect.
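One way to put a number on "stagnant" is to compare the spread of per-window rewards for a real video against the spread for zeroed frames. The snippet below is only a sketch: `summarize_rewards` is a hypothetical helper, and the reward values are made up for illustration, not taken from the recordings in this issue.

```python
from statistics import mean, pstdev


def summarize_rewards(rewards):
    """Return (mean, population std, max - min) for a list of rewards."""
    return mean(rewards), pstdev(rewards), max(rewards) - min(rewards)


# Hypothetical cosine-similarity rewards, for illustration only.
real_video = [0.29, 0.30, 0.30, 0.31, 0.30]
zero_frames = [0.30, 0.30, 0.30, 0.30, 0.30]

real_mean, real_std, real_range = summarize_rewards(real_video)
zero_mean, zero_std, zero_range = summarize_rewards(zero_frames)

# If the model reacts to the video content, the real video should show a
# clearly larger spread than the zeroed frames.
print(f"real: mean={real_mean:.3f} std={real_std:.4f} range={real_range:.3f}")
print(f"zero: mean={zero_mean:.3f} std={zero_std:.4f} range={zero_range:.3f}")
```

A spread for the real video that is barely larger than the one for zeroed frames would match the behaviour described above.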

yunfanjiang commented 1 year ago

Hey there, thanks for your interest and for raising the issue. Could you share more information, such as the script you used, so we can look into it?

nickioan commented 1 year ago

Thank you so much for your timely response. Below I have included the main code that gets the reward from our recorded video. Note that we divide the reward by the exponentiated logit_scale parameter to get the cosine similarity.

Main code

```
import os
import torch
import torchvision
from torch.utils.data import DataLoader
import wandb
import yaml

from video_utils import (VideoDataset, init_wandb, load_mineclip,
                         example_read_video)

STREAM = 'video'

if __name__ == '__main__':
    import argparse

    # get the task name from the command line
    parser = argparse.ArgumentParser()
    parser.add_argument('--task', type=str, default='task1')
    args = parser.parse_args()
    TASK = args.task

    with open("tasks_conf.yaml", "r") as stream:
        try:
            conf = yaml.safe_load(stream)[TASK]
        except yaml.YAMLError as exc:
            print(exc)

    PATH = os.path.join(os.getcwd(), 'recordings', '{}.mp4'.format(conf['name']))
    prompts = conf['prompts']
    init_wandb(task_id=conf['name'])
    prompt_ids = ["Sub-Task {}".format(x+1) for x in range(len(prompts))]

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    video = torchvision.io.VideoReader(PATH, STREAM)
    vf, info, meta = example_read_video(video)
    dataset = VideoDataset(vf)
    loader = DataLoader(
        dataset,
        batch_size=16,
        shuffle=False
    )

    model = load_mineclip(device)
    with torch.no_grad():
        prompt_feats = model.encode_text(prompts)

    for data in loader:
        data = torch.unsqueeze(data, dim=0).to(device)
        with torch.no_grad():
            reward, _ = model(data, text_tokens=prompt_feats, is_video_features=False)
            reward /= torch.exp(model.clip_model.logit_scale)
        log_data = dict(zip(prompt_ids, reward[0].cpu().numpy()))
        wandb.log(log_data)
```
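The division by `torch.exp(model.clip_model.logit_scale)` in the script above relies on the CLIP convention that a logit is the cosine similarity multiplied by the exponentiated logit scale. The following is a minimal pure-Python sketch of that relationship, assuming MineCLIP follows the same convention (as the script does). The feature vectors and the scale value are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical values; in a real model logit_scale is a learned parameter.
logit_scale = math.log(100.0)
video_feat = [1.0, 2.0, 2.0]
text_feat = [2.0, 1.0, 2.0]

cos = cosine_similarity(video_feat, text_feat)

# CLIP-style logit: cosine similarity scaled by exp(logit_scale).
logit = math.exp(logit_scale) * cos

# Dividing by exp(logit_scale) recovers the cosine similarity.
recovered = logit / math.exp(logit_scale)
print(cos, logit, recovered)
```

Because the recovered value is a cosine similarity, it is bounded to the range [-1, 1], which makes plots for different prompts easier to compare than raw logits.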

In addition, here are the main helper functions that load MineCLIP and the recorded video.

Helper functions

```
import torch
import itertools
import torchvision
import torchvision.transforms as T
from torch.utils.data import Dataset
import hashlib
from omegaconf import OmegaConf
from mineclip import MineCLIP


def load_mineclip(device):
    # Initialize MineClip
    cfg = OmegaConf.load("clip_conf_simple.yaml")
    OmegaConf.set_struct(cfg, False)
    ckpt = cfg.pop("ckpt")
    OmegaConf.set_struct(cfg, True)

    assert (
        hashlib.md5(open(ckpt.path, "rb").read()).hexdigest() == ckpt.checksum
    ), "broken ckpt"

    model = MineCLIP(**cfg).to(device)
    model.load_ckpt(ckpt.path, strict=True)
    model.eval()
    return model


def example_read_video(video_object, start=0, end=None):
    transform = T.Resize((160, 256))
    if end is None:
        end = float("inf")
    if end < start:
        raise ValueError(
            "end time should be larger than start time, got "
            f"start time={start} and end time={end}"
        )

    video_frames = torch.empty(0)
    video_pts = []
    video_object.set_current_stream("video")
    frames = []
    for frame in itertools.takewhile(lambda x: x['pts'] <= end, video_object.seek(start)):
        frames.append(transform(frame['data']))
        video_pts.append(frame['pts'])
    if len(frames) > 0:
        video_frames = torch.stack(frames, 0)
    return video_frames, video_pts, video_object.get_metadata()
```
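The checksum assertion in `load_mineclip` reads the whole checkpoint into memory before hashing it. As a side note, the same MD5 digest can be computed in chunks. The sketch below uses only the standard library; `md5_of_file` is a hypothetical helper name, and the demo hashes a throwaway file rather than a real checkpoint.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a temporary file standing in for a checkpoint.
payload = b"not a real checkpoint" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    chunked = md5_of_file(tmp_path, chunk_size=4096)
    whole = hashlib.md5(payload).hexdigest()
    print(chunked == whole)
finally:
    os.remove(tmp_path)
```

Both approaches produce the same digest, so the comparison against `ckpt.checksum` would behave identically; the chunked version only changes peak memory use.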

We made our own configuration file for MineCLIP to avoid using Hydra, and the weights were downloaded as directed in the README file.

MineCLIP config

```
arch: "vit_base_p16_fz.v2.t2"
hidden_dim: 512
image_feature_dim: 512
mlp_adapter_spec: "v0-2.t0"
pool_type: "attn.d2.nh8.glusw"  # filled by variant
resolution: [160, 256]
ckpt:
  path: "mc_weights/attn.pth"  # filled by users
  checksum: "b5ece9198337cfd117a3bfbd921e56da"
```

This is the link with some of the recorded videos

We also made a configuration file for each task, with task prompts generated by GPT-3.

Task Configurations

```
task1:
  name: GPT-Creative-331
  prompts:
    - The first thing you need to do is find a tree. You can do this by exploring your world or by using a seed that is known to have tall trees.
    - Once you have found a tree, you need to measure its height. This can be done by using a ruler or by measuring the block height of the tree.
    - After you have measured the height of the tree, you need to find a spot to build your house. The best spot to build your house is on top of the tree.
    - Once you have found a spot to build your house, you need to start building it. You can use any type of block to build your house.
    - After you have built your house, you need to decorate it. You can use furniture, paintings, and other items to decorate your house.
    - After you have decorated your house, you need to move in. You can do this by placing a bed in your house.
    - After you have moved in, you need to enjoy your new home.
task2:
  name: David-Creative-331
  prompts:
    - The first thing you need to do is find a tree. You can do this by exploring your world or by using a seed that is known to have tall trees.
    - Once you have found a tree, you need to measure its height. This can be done by using a ruler or by measuring the block height of the tree.
    - After you have measured the height of the tree, you need to find a spot to build your house. The best spot to build your house is on top of the tree.
    - Once you have found a spot to build your house, you need to start building it. You can use any type of block to build your house.
    - After you have built your house, you need to decorate it. You can use furniture, paintings, and other items to decorate your house.
    - After you have decorated your house, you need to move in. You can do this by placing a bed in your house.
    - After you have moved in, you need to enjoy your new home.
task3:
  name: GPT-create-cookie
  prompts:
    - Find a tree.
    - Cut the tree down with an axe to obtain wood.
    - Craft a crafting table with the wood.
    - Place the crafting table down and open it.
    - Find the cookie recipe in the crafting menu and craft the cookies.
task4:
  name: David-create-cookie
  prompts:
    - Find a tree.
    - Cut the tree down with an axe to obtain wood.
    - Craft a crafting table with the wood.
    - Place the crafting table down and open it.
    - Find the cookie recipe in the crafting menu and craft the cookies.
task5:
  name: GPT-creative-118
  prompts:
    - Find a suitable location in your house for the trap. A small room or closet would be ideal.
    - Place a pressure plate or tripwire at the entrance to the room.
    - Place a few TNT blocks inside the room.
    - When the skeleton enters the room, it will trigger the pressure plate or tripwire, causing the TNT to explode and kill the skeleton.
task6:
  name: David-creative-118
  prompts:
    - Find a suitable location in your house for the trap. A small room or closet would be ideal.
    - Place a pressure plate or tripwire at the entrance to the room.
    - Place a few TNT blocks inside the room.
    - When the skeleton enters the room, it will trigger the pressure plate or tripwire, causing the TNT to explode and kill the skeleton.
task7:
  name: David-creative-481
  prompts:
    - Find the tallest mountain in Minecraft. This might take some time, depending on the size of your world.
    - Once you've found the tallest mountain, climb to the top.
    - When you're at the top, find a suitable place to jump off. Make sure there's nothing in the way that could block your jump, and that the area below is clear.
    - When you're ready, take a running jump off the edge of the mountain.
    - As you're falling, deploy your parachute.
    - Enjoy the view as you float down to the ground!
task8:
  name: GPT-creative-481
  prompts:
    - Find the tallest mountain in Minecraft. This might take some time, depending on the size of your world.
    - Once you've found the tallest mountain, climb to the top.
    - When you're at the top, find a suitable place to jump off. Make sure there's nothing in the way that could block your jump, and that the area below is clear.
    - When you're ready, take a running jump off the edge of the mountain.
    - As you're falling, deploy your parachute.
    - Enjoy the view as you float down to the ground!
task9:
  name: David-golden-pickaxe
  prompts:
    - Find a place with a lot of trees.
    - Cut down the trees and gather the wood.
    - Find a place with a lot of stone.
    - Mine the stone and gather the cobblestone.
    - Find a place with a lot of iron.
    - Mine the iron and gather the iron ingots.
    - Find a place with a lot of gold.
    - Mine the gold and gather the gold ingots.
    - Craft a golden pickaxe.
task10:
  name: GPT-golden-pickaxe
  prompts:
    - Find a place with a lot of trees.
    - Cut down the trees and gather the wood.
    - Find a place with a lot of stone.
    - Mine the stone and gather the cobblestone.
    - Find a place with a lot of iron.
    - Mine the iron and gather the iron ingots.
    - Find a place with a lot of gold.
    - Mine the gold and gather the gold ingots.
    - Craft a golden pickaxe.
task11:
  name: David-Sailboat-with-sheep
  prompts:
    - Find a boat.
    - Place the sheep in the boat.
    - Right-click on the boat with an empty hand to get in.
    - Use the WASD keys to move the boat. The sheep should stay in the boat.
task12:
  name: GPT-Sailboat-with-sheep
  prompts:
    - Find a boat.
    - Place the sheep in the boat.
    - Right-click on the boat with an empty hand to get in.
    - Use the WASD keys to move the boat. The sheep should stay in the boat.
task13:
  name: David-Shear-sheep
  prompts:
    - Find a sheep.
    - Right-click on the sheep with your hand to interact with it.
    - When the sheep''s health bar appears, wait for it to turn white.
    - Left-click on the sheep to shear it.
    - Collect the wool that appears.
task14:
  name: GPT-Shear-sheep
  prompts:
    - Find a sheep.
    - Right-click on the sheep with your hand to interact with it.
    - When the sheep''s health bar appears, wait for it to turn white.
    - Left-click on the sheep to shear it.
    - Collect the wool that appears.
task15:
  name: David-trap-zombie
  prompts:
    - Find a zombie.
    - Lure the zombie into a house with food.
    - Close the door to the house so the zombie cannot escape.
    - Enjoy your new pet zombie!
task16:
  name: GPT-trap-zombie
  prompts:
    - Find a zombie.
    - Lure the zombie into a house with food.
    - Close the door to the house so the zombie cannot escape.
    - Enjoy your new pet zombie!
```

aadharna commented 1 year ago

Just wanted to provide the updated task configurations (the only change is that the first prompt on every task is the sentence that you all provided to GPT to generate the curriculum)

Task Configurations

```
task1:
  name: GPT-Creative-331
  prompts:
    - Find the tallest tree in the game world and build a house in it
    - The first thing you need to do is find a tree. You can do this by exploring your world or by using a seed that is known to have tall trees.
    - Once you have found a tree, you need to measure its height. This can be done by using a ruler or by measuring the block height of the tree.
    - After you have measured the height of the tree, you need to find a spot to build your house. The best spot to build your house is on top of the tree.
    - Once you have found a spot to build your house, you need to start building it. You can use any type of block to build your house.
    - After you have built your house, you need to decorate it. You can use furniture, paintings, and other items to decorate your house.
    - After you have decorated your house, you need to move in. You can do this by placing a bed in your house.
    - After you have moved in, you need to enjoy your new home.
task2:
  name: David-Creative-331
  prompts:
    - Find the tallest tree in the game world and build a house in it
    - The first thing you need to do is find a tree. You can do this by exploring your world or by using a seed that is known to have tall trees.
    - Once you have found a tree, you need to measure its height. This can be done by using a ruler or by measuring the block height of the tree.
    - After you have measured the height of the tree, you need to find a spot to build your house. The best spot to build your house is on top of the tree.
    - Once you have found a spot to build your house, you need to start building it. You can use any type of block to build your house.
    - After you have built your house, you need to decorate it. You can use furniture, paintings, and other items to decorate your house.
    - After you have decorated your house, you need to move in. You can do this by placing a bed in your house.
    - After you have moved in, you need to enjoy your new home.
task3:
  name: GPT-create-cookie
  prompts:
    - find material and craft to obtain cookie
    - Find a tree.
    - Cut the tree down with an axe to obtain wood.
    - Craft a crafting table with the wood.
    - Place the crafting table down and open it.
    - Find the cookie recipe in the crafting menu and craft the cookies.
task4:
  name: David-create-cookie
  prompts:
    - find material and craft to obtain cookie
    - Find a tree.
    - Cut the tree down with an axe to obtain wood.
    - Craft a crafting table with the wood.
    - Place the crafting table down and open it.
    - Find the cookie recipe in the crafting menu and craft the cookies.
task5:
  name: GPT-creative-118
  prompts:
    - Trap a skeleton in house.
    - Find a suitable location in your house for the trap. A small room or closet would be ideal.
    - Place a pressure plate or tripwire at the entrance to the room.
    - Place a few TNT blocks inside the room.
    - When the skeleton enters the room, it will trigger the pressure plate or tripwire, causing the TNT to explode and kill the skeleton.
task6:
  name: David-creative-118
  prompts:
    - Trap a skeleton in house.
    - Find a suitable location in your house for the trap. A small room or closet would be ideal.
    - Place a pressure plate or tripwire at the entrance to the room.
    - Place a few TNT blocks inside the room.
    - When the skeleton enters the room, it will trigger the pressure plate or tripwire, causing the TNT to explode and kill the skeleton.
task7:
  name: David-creative-481
  prompts:
    - Base jump off the tallest mountain
    - Find the tallest mountain in Minecraft. This might take some time, depending on the size of your world.
    - Once you've found the tallest mountain, climb to the top.
    - When you're at the top, find a suitable place to jump off. Make sure there's nothing in the way that could block your jump, and that the area below is clear.
    - When you're ready, take a running jump off the edge of the mountain.
    - As you're falling, deploy your parachute.
    - Enjoy the view as you float down to the ground!
task8:
  name: GPT-creative-481
  prompts:
    - Base jump off the tallest mountain
    - Find the tallest mountain in Minecraft. This might take some time, depending on the size of your world.
    - Once you've found the tallest mountain, climb to the top.
    - When you're at the top, find a suitable place to jump off. Make sure there's nothing in the way that could block your jump, and that the area below is clear.
    - When you're ready, take a running jump off the edge of the mountain.
    - As you're falling, deploy your parachute.
    - Enjoy the view as you float down to the ground!
task9:
  name: David-golden-pickaxe
  prompts:
    - Find material and craft a golden pickaxe
    - Find a place with a lot of trees.
    - Cut down the trees and gather the wood.
    - Find a place with a lot of stone.
    - Mine the stone and gather the cobblestone.
    - Find a place with a lot of iron.
    - Mine the iron and gather the iron ingots.
    - Find a place with a lot of gold.
    - Mine the gold and gather the gold ingots.
    - Craft a golden pickaxe.
task10:
  name: GPT-golden-pickaxe
  prompts:
    - Find material and craft a golden pickaxe
    - Find a place with a lot of trees.
    - Cut down the trees and gather the wood.
    - Find a place with a lot of stone.
    - Mine the stone and gather the cobblestone.
    - Find a place with a lot of iron.
    - Mine the iron and gather the iron ingots.
    - Find a place with a lot of gold.
    - Mine the gold and gather the gold ingots.
    - Craft a golden pickaxe.
task11:
  name: David-Sailboat-with-sheep
  prompts:
    - Sail on boat with a sheep.
    - Find a boat.
    - Place the sheep in the boat.
    - Right-click on the boat with an empty hand to get in.
    - Use the WASD keys to move the boat. The sheep should stay in the boat.
task12:
  name: GPT-Sailboat-with-sheep
  prompts:
    - Sail on boat with a sheep.
    - Find a boat.
    - Place the sheep in the boat.
    - Right-click on the boat with an empty hand to get in.
    - Use the WASD keys to move the boat. The sheep should stay in the boat.
task13:
  name: David-Shear-sheep
  prompts:
    - shear a sheep with shears
    - Find a sheep.
    - Right-click on the sheep with your hand to interact with it.
    - When the sheep''s health bar appears, wait for it to turn white.
    - Left-click on the sheep to shear it.
    - Collect the wool that appears.
task14:
  name: GPT-Shear-sheep
  prompts:
    - shear a sheep with shears
    - Find a sheep.
    - Right-click on the sheep with your hand to interact with it.
    - When the sheep''s health bar appears, wait for it to turn white.
    - Left-click on the sheep to shear it.
    - Collect the wool that appears.
task15:
  name: David-trap-zombie
  prompts:
    - Trap a zombie in house
    - Find a zombie.
    - Lure the zombie into a house with food.
    - Close the door to the house so the zombie cannot escape.
    - Enjoy your new pet zombie!
task16:
  name: GPT-trap-zombie
  prompts:
    - Trap a zombie in house
    - Find a zombie.
    - Lure the zombie into a house with food.
    - Close the door to the house so the zombie cannot escape.
    - Enjoy your new pet zombie!
```

In particular, the craft-golden-pickaxe task has subtasks like "find a place with lots of iron". In the GPT-golden-pickaxe video we do exactly that, but the reward never goes above ~0.3 despite the agent finding iron to mine.

image


However, we ran another experiment in which the agent interacts with the environment, instead of using human gameplay, with just the "find spider" prompt and the delta reward from MineCLIP, and it gave results that seemed more correct! So we are checking whether the complexity of the GPT-generated sentences is why MineCLIP was not giving good rewards.

That being said, when we switched back to direct correlation as the reward, we saw the same results as with the human videos:

image
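To make the contrast between the two reward formulations concrete, here is a minimal sketch. It assumes that "delta reward" means the change in similarity between consecutive steps and that "direct correlation" means using the similarity itself; the thread does not define either term, so both readings are assumptions. The function names and the similarity values are hypothetical.

```python
def direct_rewards(similarities):
    """Use each similarity score as the reward for that step."""
    return list(similarities)


def delta_rewards(similarities):
    """Use the change in similarity between consecutive steps as the reward."""
    return [curr - prev for prev, curr in zip(similarities, similarities[1:])]


# Hypothetical similarity scores hovering around a constant offset of ~0.3.
sims = [0.30, 0.30, 0.31, 0.35, 0.35]

direct = direct_rewards(sims)
delta = delta_rewards(sims)

# The direct reward stays close to 0.3 throughout, while the delta reward is
# near zero except where the similarity actually changes.
print("direct:", direct)
print("delta: ", [round(d, 3) for d in delta])
```

Under these assumptions a large constant offset dominates the direct reward, while the delta formulation removes it. That would be one possible reason the delta reward looked more reasonable, but it is not confirmed anywhere in the thread.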