lingyixia opened this issue 12 months ago
Prepare the annotation files:
- `train_caption_file`: training corpus, refer to this file
- `val_caption_file`: validation corpus, refer to this file
- `eval_gt_file_for_grounding`: validation file for video grounding, refer to this file
- `dict_file`: vocabulary file of your dataset, refer to this file
Prepare the features: Gather each video's features into a single `.npy` file of shape L × D, where L is the temporal resolution and D is the feature dimension. Store all of these files in one designated folder for streamlined access.
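As a minimal sketch of the feature-preparation step above (the folder layout and file names here are illustrative assumptions, not the repo's required paths):

```python
import os
import numpy as np

# Hypothetical dimensions: L temporal steps, D-dimensional features per step.
# Replace the random array with your actual extracted features.
L, D = 128, 1024
features = np.random.rand(L, D).astype(np.float32)

# One .npy file per video, all stored in a single folder.
feature_dir = "features"  # assumed folder name
os.makedirs(feature_dir, exist_ok=True)
np.save(os.path.join(feature_dir, "video_0001.npy"), features)

# Sanity check: the stored array should round-trip with shape (L, D).
loaded = np.load(os.path.join(feature_dir, "video_0001.npy"))
print(loaded.shape)  # (128, 1024)
```

Keeping every video's features as a separate L × D array in one folder lets the data loader look files up by video ID without any extra index.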
Prepare the `.yaml` file: Create a training configuration file by modifying an existing cfg file. You can start from the template provided at Configuration File Template and fill in the annotation paths described above.
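As a rough sketch of what such a config might contain (the key names and paths below are assumptions based on the annotation files listed above; check the repo's actual template for the real schema):

```yaml
# Hypothetical config sketch -- key names and paths are illustrative only.
train_caption_file: data/anno/train.json
val_caption_file: data/anno/val.json
eval_gt_file_for_grounding: data/anno/val_grounding.json
dict_file: data/anno/vocabulary.json
feature_dir: data/features    # folder of per-video .npy files (L x D)
feature_dim: 1024             # D, must match your extracted features
```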
Hi, thanks for your work. I have a question: in the `train_caption_file`, what does the "area" field stand for?
Could you please share a pipeline for preparing a new dataset for training?