CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos
issue about extracting frames from video #5

ray342649093 commented 7 years ago

I tried to reproduce your experiment on THUMOS14. I downloaded the THUMOS14 test set and used part of the code from the C3D project to extract frames from the 213 test videos. That gave me 1351825 frames in total, which differs from the number of frames you extracted (around 1157824, judging from your postprocessing code). I then used your Python code to generate the bin files and got 42347, whereas you had 36182. So I changed the number of mini-batches to 10567 and output 42347 features.
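
The counts above can be cross-checked with a little arithmetic. The sketch below assumes that each bin file holds one non-overlapping window of 32 frames and that each video is padded up to a whole number of windows; neither is stated in this thread, and the window length is only inferred from the reference numbers (36182 × 32 = 1157824). Variable names are illustrative.

```python
# Counts quoted in this issue.
my_frames, my_bins = 1351825, 42347
ref_frames, ref_bins = 1157824, 36182

# Assumption: one bin file = one window of 32 consecutive frames.
WINDOW = 32

# The reference frame count equals the reference bin count times 32,
# so "around 1157824" may be a padded total rather than a raw frame count.
print(ref_bins * WINDOW == ref_frames)  # True

# If every video is padded to a whole window, the bin count exceeds
# ceil(total_frames / 32) by less than one bin per video (213 videos).
min_bins = -(-my_frames // WINDOW)  # ceiling division
print(min_bins, my_bins, my_bins - min_bins)

# Ratio between the two frame counts, useful when comparing frame rates.
print(round(my_frames / ref_frames, 4))
```

Under these assumptions the 42347 bins are consistent with 1351825 frames, so the gap to the reference comes from the frame extraction step rather than from the bin generation.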

I generated my own per-frame ground truth labels and ran your postprocessing code, and finally got an mAP of 0.1426. The predicted probabilities look poor: most frames have a high probability for background, and the remaining frames do not have a high enough probability for any action class even when their background probability is low. Could you see what might be the problem? The code I used to extract frames is attached below.

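import os
import cv2

def get_action_video_id(input_dir):
    ''' Collect the unique video ids listed in the annotation files '''
    filenames = []
    for root, dirs, files in os.walk(input_dir):
        for i in files:
            with open(os.path.join(root, i)) as f:
                for line in f:
                    # the first 18 characters of each annotation line hold the video id
                    if line[:18] not in filenames:
                        filenames.append(line[:18])
    return filenames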

def get_frame_count(video):
    ''' Get frame counts and FPS for a video '''
    cap = cv2.VideoCapture(video)
    if not cap.isOpened():
        print "[Error] video={} can not be opened.".format(video)

    # get frame counts
    num_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)

    # in case, fps was not available, use default of 29.97
    if not fps or fps != fps:
        fps = 29.97

    return num_frames, fps

def extract_frames(video, start_frame, frame_dir, num_frames_to_extract=16):
    ''' Extract frames from a video using opencv '''

    # check output directory
    if os.path.isdir(frame_dir):
        print "[Warning] frame_dir={} does exist. Will overwrite".format(frame_dir)
    else:
        # cv2.imwrite fails silently when the target directory is missing
        os.makedirs(frame_dir)

    # get number of frames
    cap = cv2.VideoCapture(video)
    if not cap.isOpened():
        print "[Error] video={} can not be opened.".format(video)

    # move to start_frame
    cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)

    # grab each frame and save
    for frame_count in range(num_frames_to_extract):
        frame_num = frame_count + start_frame
        print "[Info] Extracting frame num={}".format(frame_num)
        ret, frame = cap.read()
        if not ret:
            print "[Error] Frame extraction was not successful"

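        # the file name pattern was cut off in the post; the one below is an assumption
        frame_file = os.path.join(frame_dir, '{0:06d}.jpg'.format(frame_num))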
        cv2.imwrite(frame_file, frame)


def main():
    input_annotations_dir = '/home/rusu5516/TH14_Temporal_Annotations_Test/annotations/annotation/'
    filenames = get_action_video_id(input_annotations_dir)
    input_videos_dir = '/home/rusu5516/TH14_test_set_mp4/'
    for file in filenames:
        for root,dirs,files in os.walk(os.path.join(input_videos_dir,'all_frames_pervideo')):
            for i in files:
                if file == i:
                    num_frames, fps = get_frame_count(os.path.join(input_videos_dir,file+'.mp4'))
                    extract_frames(input_videos_dir+file+'.mp4', 0, input_videos_dir+'all_frames_pervideo/'+file, num_frames)

if __name__ == '__main__':
    main()

zhengshou commented 7 years ago

Hi, thank you for your interest.

I used FPS=25 for all test videos, although it should not affect performance much.
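
To illustrate how a fixed frame rate changes both the frame count and the per-frame labels, here is a minimal sketch in Python 3. It assumes frames are resampled uniformly at 25 fps and that annotations are start/end times in seconds; the function names and the example numbers are hypothetical and are not taken from the CDC code.

```python
def frames_at_fps(num_frames, native_fps, target_fps=25.0):
    """Approximate frame count after resampling a video to target_fps."""
    duration_s = num_frames / float(native_fps)
    return int(round(duration_s * target_fps))


def frame_labels(segments, num_frames, fps=25.0, background=0):
    """Per-frame class labels from (start_s, end_s, class_id) segments."""
    labels = [background] * num_frames
    for start_s, end_s, class_id in segments:
        first = max(0, int(start_s * fps))
        last = min(num_frames, int(end_s * fps) + 1)
        for k in range(first, last):
            labels[k] = class_id
    return labels


# Hypothetical video: 3000 frames decoded at 30 fps is 100 s long,
# which corresponds to 2500 frames at 25 fps.
n25 = frames_at_fps(3000, 30.0)
labels = frame_labels([(2.0, 4.0, 7)], n25)
print(n25, labels.count(7))
```

Labels built with the native frame rate would be shifted relative to frames extracted at 25 fps, so it may be worth checking that both use the same rate.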

As for frame extraction, instead of cv2, I used ffmpeg to extract frames in png format. You could refer to my previous scnn demo code to learn more about this.
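
A rough sketch of ffmpeg-based extraction at a fixed frame rate, driven from Python 3. The exact ffmpeg options and the file naming used for CDC are not given in this thread, so the `-r 25` output rate, the `%06d.png` pattern, and the helper names are assumptions. ffmpeg must be installed and on the PATH for the extraction itself to run.

```python
import os
import subprocess


def build_ffmpeg_cmd(video, frame_dir, fps=25):
    """Build an ffmpeg command that writes PNG frames at a fixed rate."""
    pattern = os.path.join(frame_dir, "%06d.png")  # assumed naming scheme
    return ["ffmpeg", "-i", video, "-r", str(fps), pattern]


def extract_with_ffmpeg(video, frame_dir, fps=25):
    """Create frame_dir if needed and run ffmpeg on one video."""
    os.makedirs(frame_dir, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(video, frame_dir, fps), check=True)


# Only prints the command; call extract_with_ffmpeg(...) to run it.
print(" ".join(build_ffmpeg_cmd("video.mp4", "frames")))
```

Placing `-r` after the input makes ffmpeg duplicate or drop frames to reach the requested output rate, which is one reason the total can differ from a frame-by-frame cv2 dump at the native rate.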