Closed by adeljalalyousif 1 year ago
Hi adeljalalyousif
To train the captioning module on ground truth proposals, run the following:
# conda activate bmt
python main.py \
--procedure train_cap \
--B 32
Thanks for your response, but I got this error: "FileNotFoundError: [Errno 2] No such file or directory: './best_prop_model.pt'". The full output is:
{'B': 32,
'H': 4,
'N': 2,
'anchors_num_audio': 48,
'anchors_num_video': 128,
'audio_feature_name': 'vggish',
'audio_feature_timespan': 0.96,
'audio_features_path': './data/vggish_npy/',
'avail_mp4_path': './data/available_mp4.txt',
'betas': [0.9, 0.999],
'conv_layers_audio': [512, 512],
'conv_layers_video': [512, 512],
'd_aud': 128,
'd_ff_audio': None,
'd_ff_caps': None,
'd_ff_video': None,
'd_model': 1024,
'd_model_audio': None,
'd_model_caps': 300,
'd_model_video': None,
'd_vid': 1024,
'debug': False,
'device_ids': [0],
'dout_p': 0.1,
'early_stop_after': 30,
'end_token': '',
'epoch_num': 4,
'eps': 1e-08,
'feature_timespan_in_fps': 64,
'finetune_cap_encoder': False,
'finetune_prop_encoder': False,
'fps_at_extraction': 25,
'grad_clip': None,
'inf_B_coeff': 2,
'kernel_sizes_audio': [5, 13, 23, 35, 51, 69, 91, 121, 161, 211],
'kernel_sizes_video': [1, 5, 9, 13, 19, 25, 35, 45, 61, 79],
'layer_norm': False,
'log_dir': './log/',
'lr': 5e-05,
'lr_patience': None,
'lr_reduce_factor': None,
'max_len': 30,
'max_prop_per_vid': 100,
'min_freq_caps': 1,
'modality': 'audio_video',
'model': 'av_transformer',
'momentum': 0.0,
'nms_tiou_thresh': None,
'noobj_coeff': 100,
'obj_coeff': 1,
'one_by_one_starts_at': 1,
'optimizer': 'adam',
'pad_audio_feats_up_to': 800,
'pad_token': '',
'tIoUs': [0.3, 0.5, 0.7, 0.9],
'to_log': True,
'train_json_path': './data/train.json',
'train_meta_path': './data/train.csv',
'unfreeze_word_emb': False,
'use_linear_embedder': False,
'val_1_meta_path': './data/val_1.csv',
'val_2_meta_path': './data/val_2.csv',
'val_prop_meta_path': None,
'video_feature_name': 'i3d',
'video_features_path': './data/i3d_25fps_stack64step64_2stream_npy/',
'weight_decay': 0,
'word_emb_caps': 'glove.840B.300d'}
Contructing caption_iterator for "train" phase
Contructing caption_iterator for "val_1" phase
Contructing caption_iterator for "val_2" phase
Using vanilla Generator
initialization: xavier
Glove emb of the same size as d_model_caps
Pretrained prop path:
./best_prop_model.pt
Traceback (most recent call last):
File "main.py", line 200, in
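The log stops right after printing the pretrained proposal path, which suggests the failure is the checkpoint lookup itself. A minimal pre-flight sketch, assuming only the path shown in the log (`./best_prop_model.pt`); the helper name `check_checkpoint` is hypothetical and not part of the repository:

```python
from pathlib import Path


def check_checkpoint(path: str) -> bool:
    """Return True if `path` points to an existing, non-empty file."""
    ckpt = Path(path)
    return ckpt.is_file() and ckpt.stat().st_size > 0


if __name__ == "__main__":
    # Path taken from the log; adjust it if your checkpoint lives elsewhere.
    ckpt_path = "./best_prop_model.pt"
    if check_checkpoint(ckpt_path):
        print(f"Found checkpoint: {ckpt_path}")
    else:
        # Show the absolute path so a wrong working directory is easy to spot.
        print(f"Missing checkpoint: {Path(ckpt_path).resolve()}")
```

Because the path is relative, running this from the same directory as `main.py` shows where the script would look for the file.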
###########################################################################
I need to train the captioning module on ground truth proposals without using learned proposals.
After downloading 'best_prop_model.pt', training works, but only on the CPU. How can I make training run on the GPU? I have
an RTX 3060 (6 GB), and I think my GPU memory is insufficient.
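When training ends up on the CPU, one possible cause is that the installed PyTorch build cannot see the GPU at all. A quick check, as a sketch only: `cuda_report` is a hypothetical helper, and it assumes the card would be CUDA device 0, matching `'device_ids': [0]` in the config above.

```python
import importlib.util


def cuda_report() -> str:
    """Describe whether PyTorch can see a CUDA device."""
    # Avoid an ImportError when run outside the training environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch

    if not torch.cuda.is_available():
        return "PyTorch is installed but CUDA is not available"
    # Device 0 is an assumption taken from 'device_ids': [0] in the config.
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024 ** 3
    return f"CUDA device 0: {props.name}, {total_gib:.1f} GiB"


if __name__ == "__main__":
    print(cuda_report())
```

If this reports that CUDA is not available, the 6 GB of memory is probably not why training ran on the CPU, since running out of GPU memory normally raises an error instead of falling back silently.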
So how can I train the captioning module on ground truth proposals without using learned proposals?
Hi Iashin, I need to train the captioning module on ground truth proposals. What should I do?