MichiganCOG / Video-Grounding-from-Text

Source code for "Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction"

where can I get '--image_root', default='./data/yc2/video_segments_25fps'? #4

Closed Tomwmg closed 5 years ago

Tomwmg commented 5 years ago

I am sorry to trouble you again. When I sampled the videos at 1 fps, I found that some frames did not align well with the box annotations. I tried several methods, but none of them fit the box annotations perfectly. If it is convenient, could you provide the code you used to sample the videos into frames, or the 'video_segments_25fps' files used in your paper? Thank you for your help.

natlouis commented 5 years ago

Older versions of ffmpeg use different methods for seeking to the starting frame. We used ffmpeg version 2.8.15 on Ubuntu.

And the exact code we used to sample the frames is:

```python
# start_t:      segment start time (from the JSON annotation file)
# vid_path:     source video
# seg_dur:      segment duration (calculated from the JSON annotation file)
# fps:          25
# segment_path: save location
os.system(' '.join(('ffmpeg', '-ss', str(start_t), '-i', vid_path,
                    '-t', str(seg_dur),
                    '-vf', '"fps=' + str(fps), ', scale=720:-1"',
                    os.path.join(segment_path, '%02d.jpg'))))
```
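For anyone adapting this, an equivalent invocation built as an argument list (and run with `subprocess.run` instead of `os.system`) avoids the shell-quoting pitfalls of joining strings by hand. This is only a sketch; the function name and layout are illustrative, not from the repo:

```python
import os

def build_ffmpeg_cmd(vid_path, segment_path, start_t, seg_dur, fps=25):
    """Build the ffmpeg argv list for extracting one segment's frames.

    Pass the result to subprocess.run(cmd, check=True); no shell quoting
    is needed because each argument is its own list element.
    """
    return ['ffmpeg',
            '-ss', str(start_t),                       # seek to segment start
            '-i', vid_path,                            # source video
            '-t', str(seg_dur),                        # segment duration
            '-vf', 'fps={},scale=720:-1'.format(fps),  # resample + resize to 720px wide
            os.path.join(segment_path, '%02d.jpg')]    # zero-padded frame names
```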

[EDIT]: The frames you are referencing are sampled at 25 fps. The box annotations are only supplied at 1 fps, but vis.py takes care to ground only the annotated frames.
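Since frames are saved at 25 fps while boxes exist at 1 fps, matching an annotated second to a saved frame reduces to a simple index mapping along these lines (a sketch of the idea only; vis.py's actual logic may differ, and the function name here is hypothetical):

```python
def annotated_frame_index(t_sec, fps=25, first_index=1):
    # Frame files from the ffmpeg command are numbered starting at 1
    # ('%02d.jpg'), so the frame at annotated second t_sec (0-based,
    # relative to the segment start) is roughly frame t_sec * fps + 1.
    return t_sec * fps + first_index
```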