v-iashin / MDVC

PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
https://v-iashin.github.io/mdvc

videoCategoriesMetaUS.json #13

Closed niu1970 closed 4 years ago

niu1970 commented 4 years ago
parser.add_argument(
    '--video_categories_meta_path', type=str, default='./data/videoCategoriesMetaUS.json',
    help='Path to the categories meta from Youtube API: \
    https://developers.google.com/youtube/v3/docs/videoCategories/list'
)

Hello! May I ask where I can find the file referenced above?

v-iashin commented 4 years ago

Hi! Sure!

Let me first explain why this argument exists even though you do not see the file in ./data/. I have not shared it because the code never actually uses it; you can search for video_categories_meta_path in this repo to verify this. I left the argument in place because refactoring the code to remove it could introduce bugs, so I decided to keep it as-is.

Anyway, please let me know if you need the file.
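For completeness, the file can be regenerated directly from the YouTube Data API endpoint linked in the argparse help string. Below is a minimal sketch; it assumes you have your own API key, and the function and file-layout choices here are illustrative, not something the MDVC code depends on (as noted above, the code never reads this file).

```python
# Sketch: regenerate videoCategoriesMetaUS.json from the YouTube Data API.
# Assumes a valid API key; the output is simply the raw API response.
import json
import urllib.parse
import urllib.request

API_URL = 'https://www.googleapis.com/youtube/v3/videoCategories'


def build_request_url(region_code: str, api_key: str) -> str:
    """Build the videoCategories/list URL for a given region."""
    params = urllib.parse.urlencode({
        'part': 'snippet',
        'regionCode': region_code,
        'key': api_key,
    })
    return f'{API_URL}?{params}'


def fetch_categories_meta(api_key: str,
                          path: str = './data/videoCategoriesMetaUS.json'):
    """Download the US categories meta and save it as JSON (network call)."""
    url = build_request_url('US', api_key)
    with urllib.request.urlopen(url) as resp:
        meta = json.load(resp)
    with open(path, 'w') as f:
        json.dump(meta, f, indent=2)
    return meta
```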

niu1970 commented 4 years ago

When I ran the main.py file, I got the following error, and I thought it was caused by a missing data file. Take a look:

/home/njj/anaconda3/envs/mdvc/bin/python3.7 /home/njj/niu/video-caption-project/MDVC/MDVC-master/main.py --device_ids 0
Namespace(B=28, H=4, N=1, audio_feature_name='vggish', audio_features_path='./data/sub_activitynet_v1-3.vggish.hdf5', average_audio_feats=False, average_video_feats=False, betas=[0.9, 0.98], comment='', criterion='label_smoothing', d_aud=128, d_cat=None, d_ff_audio=2048, d_ff_subs=2048, d_ff_video=2048, d_model_audio=None, d_model_subs=512, d_model_video=None, d_vid=1024, device_ids=[0], dout_p=0.1, early_stop_after=50, end_token='', epoch_num=45, eps=1e-08, filter_audio_feats=False, filter_video_feats=False, inf_B_coeff=2, log_dir='./log/', lr=1e-05, lr_coeff=None, max_len=50, max_prop_per_vid=1000, min_freq=1, modality='subs_audio_video', model='transformer', one_by_one_starts_at=0, optimizer='adam', pad_token='', reference_paths=['./data/val_1.json', './data/val_2.json'], scheduler='constant', smoothing=0.7, start_epoch=0, start_token='', tIoUs=[0.3, 0.5, 0.7, 0.9], to_log=True, train_meta_path='./data/train_meta.csv', use_categories=False, use_linear_embedder=False, val_1_meta_path='./data/val_1_meta.csv', val_2_meta_path='./data/val_2_meta.csv', val_prop_meta_path='./data/bafcg_val_100_proposal_result.csv', verbose_evaluation=True, video_categories_meta_path='./data/videoCategoriesMetaUS.json', video_feature_name='i3d', video_features_path='./data/sub_activitynet_v1-3.i3d_25fps_stack24step24_2stream.hdf5', videos_to_monitor=['v_GGSY1Qvo990', 'v_bXdq2zI1Ms0', 'v_aLv03Fznf5A'], warmup_steps=None)
log_path: ./log/0710133050
model_checkpoint_path: ./log/0710133050
Preparing dataset for train
Preparing dataset for val_1
Preparing dataset for val_2
Preparing dataset for val_1
using SubsAudioVideoGeneratorConcatLinearDoutLinear
initialization: xavier
Param Num: 178749320
13:32:38 train (0):   0%|          | 0/1221 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/main.py", line 572, in <module>
    main(cfg)
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/main.py", line 278, in main
    cfg.modality, cfg.use_categories
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/epoch_loop/run_epoch.py", line 280, in training_loop
    for i, batch in enumerate(tqdm(loader, desc=f'{time} train ({epoch})')):
  File "/home/njj/anaconda3/envs/mdvc/lib/python3.7/site-packages/tqdm/std.py", line 1127, in __iter__
    for obj in iterable:
  File "/home/njj/anaconda3/envs/mdvc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self.dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/njj/anaconda3/envs/mdvc/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/njj/anaconda3/envs/mdvc/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/dataset/dataset.py", line 450, in __getitem__
    to_return = caption_data, *self.features_dataset[caption_data.idx]
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/dataset/dataset.py", line 366, in __getitem__
    return self.getitem_2_stream_video(indices)
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/dataset/dataset.py", line 316, in getitem_2_stream_video
    video_id, start, end, duration, self.get_full_feat
  File "/home/njj/niu/video-caption-project/MDVC/MDVC-master/dataset/dataset.py", line 33, in load_multimodal_features_from_h5
    assert video_stack_rgb.shape == video_stack_flow.shape
AttributeError: 'NoneType' object has no attribute 'shape'

v-iashin commented 4 years ago

Why do you think it is related to videoCategoriesMetaUS.json?

Anyway, I would start by comparing the MD5 sums of your feature files against the reference values below and make sure they match:

# MD5 Hash
a661cfe3535c0d832ec35dd35a4fdc42  sub_activitynet_v1-3.i3d_25fps_stack24step24_2stream.hdf5
54398be59d45b27397a60f186ec25624  sub_activitynet_v1-3.vggish.hdf5

If it is not related to the videoCategoriesMetaUS.json question, please close this issue and open a new one.
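For context, an AttributeError like the one in the traceback typically appears when one of the HDF5 lookups returns None (for example, a missing video id in a truncated or corrupted download): h5py's Group.get(), like dict.get(), yields None for an absent key, and the subsequent .shape access fails. This is an assumption about the root cause, not something confirmed from the repo; here is a stdlib-only sketch of the mechanism, with a stand-in object instead of a real HDF5 dataset:

```python
# Sketch of the failure mode: a .get() lookup on a missing key returns
# None, and accessing .shape on None raises AttributeError, exactly as
# in the traceback. FakeStack is a hypothetical stand-in for an h5py
# Dataset; real h5py Group.get() behaves the same way for missing keys.
class FakeStack:
    """Stand-in for an HDF5 dataset with a .shape attribute."""
    shape = (24, 1024)


features = {'v_abc/i3d_features/rgb': FakeStack()}  # note: no 'flow' key

video_stack_rgb = features.get('v_abc/i3d_features/rgb')    # FakeStack
video_stack_flow = features.get('v_abc/i3d_features/flow')  # None

try:
    assert video_stack_rgb.shape == video_stack_flow.shape
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'shape'
```

If the MD5 sums above do not match, re-downloading the feature files should fix this.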

niu1970 commented 4 years ago

Ok, thank you!