The 'narrations.json' file published by EGO4D does not provide the start and end time of each clip, only the 'timestamp_frame':
First, for each narration_pass, I sort all clips by 'timestamp_frame'. Then, for the current clip, its 'timestamp_frame' is the start frame, and the 'timestamp_frame' of the next clip is the end frame. However, the preprocessed data contains clips longer than 5 minutes, which contradicts the explanation in #13 that "each clip was about 200 frames, so about 10 seconds".
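For reference, here is a minimal sketch of the segmentation I described, assuming the usual narrations.json layout (video_uid → narration_pass_* → "narrations" entries with a 'timestamp_frame' field); 'MIN_FRAMES' and 'MAX_FRAMES' are hypothetical placeholders, since the actual thresholds are exactly what I am asking about:

```python
import json

# Load the EGO4D narrations file; verify the field names against your copy.
with open("narrations.json") as f:
    narrations = json.load(f)

clips = []
for video_uid, passes in narrations.items():
    for pass_name in ("narration_pass_1", "narration_pass_2"):
        entries = passes.get(pass_name, {}).get("narrations", [])
        # Sort by timestamp_frame, then pair each narration with the next
        # one to obtain (start_frame, end_frame) for each clip.
        entries = sorted(entries, key=lambda n: n["timestamp_frame"])
        for cur, nxt in zip(entries, entries[1:]):
            clips.append({
                "video_uid": video_uid,
                "start_frame": cur["timestamp_frame"],
                "end_frame": nxt["timestamp_frame"],
                "text": cur.get("narration_text", ""),
            })

# Hypothetical length filter -- the real MIN_FRAMES / MAX_FRAMES values
# are the thresholds this issue asks about.
MIN_FRAMES, MAX_FRAMES = 30, 600
filtered = [c for c in clips
            if MIN_FRAMES <= c["end_frame"] - c["start_frame"] <= MAX_FRAMES]
```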
Is this clip segmentation appropriate? When filtering out clips that are too long or too short, what maximum and minimum frame counts did you set?