Questions about clip preprocessing

The 'narrations.json' file published by EGO4D does not provide the start time and end time of each clip, only the 'timestamp_frame'.

For each narration_pass, I first sort all clips by 'timestamp_frame'. Then, for the current clip, I take its own 'timestamp_frame' as the start frame and the 'timestamp_frame' of the next clip as the end frame. However, the preprocessed data then contains clips longer than 5 minutes. This differs from the explanation given in #13: "each clip was about 200 frames, so about 10 seconds".
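The segmentation described above can be sketched roughly as follows. This is only an illustration, not the actual preprocessing script: the function names `segment_clips` and `filter_by_length` are hypothetical, and it assumes each narration in a pass is a dict with an integer 'timestamp_frame' key. The last narration in a pass has no successor, so it yields no clip here.

```python
from typing import Dict, List, Tuple


def segment_clips(narrations: List[Dict]) -> List[Tuple[int, int]]:
    """Turn one narration pass into (start_frame, end_frame) pairs."""
    # Sort the narrations by frame index.
    ordered = sorted(narrations, key=lambda n: n["timestamp_frame"])
    clips = []
    # Each clip starts at its own timestamp_frame and ends at the next one's.
    for cur, nxt in zip(ordered, ordered[1:]):
        clips.append((cur["timestamp_frame"], nxt["timestamp_frame"]))
    return clips


def filter_by_length(
    clips: List[Tuple[int, int]], min_frames: int, max_frames: int
) -> List[Tuple[int, int]]:
    """Keep clips whose length in frames lies within [min_frames, max_frames]."""
    return [(s, e) for s, e in clips if min_frames <= e - s <= max_frames]


if __name__ == "__main__":
    # Toy input (made-up frame indices), deliberately unsorted.
    pass_narrations = [
        {"timestamp_frame": 300},
        {"timestamp_frame": 0},
        {"timestamp_frame": 9300},
    ]
    clips = segment_clips(pass_narrations)
    print(clips)
    # The thresholds below are placeholders, not values from the R3M authors.
    print(filter_by_length(clips, min_frames=100, max_frames=1000))
```

With sparse narrations, as in the toy input, the gap between consecutive 'timestamp_frame' values can be very large, which is how the overly long clips arise. The thresholds passed to `filter_by_length` are placeholders.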

Is this clip segmentation appropriate? When filtering out clips that are too long or too short, what maximum and minimum numbers of frames did you set?
