I am trying to fine-tune MiniGPT4-Video on my custom dataset. I could not get my own dataset builder to register, so I modified the Registry method as shown below.
I have the following in datasets/builders/image_text_pair_builder.py:
@registry.register_builder("engagenet")
class EngageNetBuilder(BaseDatasetBuilder):
    train_dataset_cls = EngageNetDataset

    DATASET_CONFIG_DICT = {
        "default": "configs/datasets/engagenet/default.yaml",
    }
    print(DATASET_CONFIG_DICT)

    def build_datasets(self):
        # download, split, etc...
        # only called on 1 GPU/TPU in distributed
        self.build_processors()
        build_info = self.config.build_info  # information from the config file
        datasets = dict()

        # create datasets
        dataset_cls = self.train_dataset_cls
        datasets['train'] = dataset_cls(
            vis_processor=self.vis_processors["train"],  # Add the vis_processor here
            text_processor=self.text_processors["train"],  # Add the text_processor here
            vis_root=build_info.vis_root,  # Add videos path here
            ann_paths=build_info.ann_paths,  # Add annotations path here
            subtitles_path=build_info.subtitles_path,  # Add subtitles path here
            model_name='mistral'  # Add model name here (llama2 or mistral)
        )
        return datasets
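For context, here is my understanding of how decorator-based registration generally works, written as a simplified stand-in. The `Registry` class below is hypothetical and is not the actual MiniGPT4-Video registry code; it only illustrates that the decorator stores the class under its name at the moment the module containing the class is imported, so a builder whose module is never imported would not be found.

```python
# Simplified, hypothetical stand-in for a decorator-based registry.
# This is NOT the MiniGPT4-Video implementation, only an illustration.
class Registry:
    _builders = {}

    @classmethod
    def register_builder(cls, name):
        """Return a decorator that stores the decorated class under `name`."""
        def wrap(builder_cls):
            if name in cls._builders:
                raise KeyError(f"Builder '{name}' is already registered")
            cls._builders[name] = builder_cls
            return builder_cls
        return wrap

    @classmethod
    def get_builder_class(cls, name):
        """Look up a registered builder; returns None if the name is unknown."""
        return cls._builders.get(name)


registry = Registry()


# The decorator runs when this class definition is executed,
# i.e. when the module that contains it is imported.
@registry.register_builder("engagenet")
class EngageNetBuilder:  # placeholder body for the sketch
    pass


found = registry.get_builder_class("engagenet")
missing = registry.get_builder_class("not_registered")
```

In this sketch, `found` is the `EngageNetBuilder` class and `missing` is `None`, which is the kind of lookup failure I would expect if the builder's module were never executed.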
How do I register a dataset builder correctly?