YiLunLee / missing_aware_prompts

Multimodal Prompting with Missing Modalities for Visual Recognition, CVPR'23
https://yilunlee.github.io/missing_aware_prompts/

About the hatememes_dataset.py #25

Open herkerser opened 9 months ago

herkerser commented 9 months ago

I noticed that the `HateMemesDataset` class sets `text_column_name` to `"plots"`, as follows:

    class HateMemesDataset(BaseDataset):
        def __init__(self, *args, split="", missing_info={}, **kwargs):
            assert split in ["train", "val", "test"]
            self.split = split

            if split == "train":
                names = ["hatememes_train"]
            elif split == "val":
                names = ["hatememes_dev"]
            elif split == "test":
                names = ["hatememes_test"]

            super().__init__(
                *args,
                **kwargs,
                names=names,
                text_column_name="plots",
                remove_duplicate=False,
            )

However, `make_arrow` in write_hatememes.py does not define a column named `plots`:

    dataframe = pd.DataFrame(
        data_list,
        columns=[
            "image",
            "text",
            "label",
            "split",
        ],
    )

This may cause the error `KeyError: 'Field "plots" does not exist in schema'` during training. Is this a mistake, or am I misunderstanding something?

CMLF-git-dev commented 9 months ago

I encountered the same error. I think the purpose of `text_column_name` is to specify which column holds the text data. Changing `"plots"` to `"text"` resolves the error.
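
To make the mismatch concrete, here is a minimal sketch in plain Python. It only mirrors the column list that `make_arrow` writes (as quoted above); `check_text_column` is a hypothetical helper written for illustration, not part of the repository, and the real error is raised inside the Arrow table lookup rather than by this function.

```python
# Columns written by make_arrow in write_hatememes.py, as quoted in this issue.
ARROW_COLUMNS = ["image", "text", "label", "split"]


def check_text_column(text_column_name, columns):
    """Hypothetical helper: fail early if the requested text column is missing."""
    if text_column_name not in columns:
        raise KeyError(
            f'Field "{text_column_name}" does not exist in schema; '
            f"available columns: {columns}"
        )
    return text_column_name


# "plots" is not among the written columns, so the lookup fails.
try:
    check_text_column("plots", ARROW_COLUMNS)
except KeyError as err:
    print("mismatch:", err)

# "text" matches the column that make_arrow actually writes.
print("ok:", check_text_column("text", ARROW_COLUMNS))
```

Under that assumption, the corresponding change in hatememes_dataset.py would be passing `text_column_name="text"` to `super().__init__`, which is the fix described in the comment above.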