dingfengshi / TriDet

[CVPR2023] Code for the paper, TriDet: Temporal Action Detection with Relative Boundary Modeling
MIT License

How do you understand this code? #29

Open elmsamcht2189 opened 8 months ago

elmsamcht2189 commented 8 months ago
    # convert time stamp (in second) into temporal feature grids
    # ok to have small negative values here

    (video_item['segments'] * video_item['fps'] - 0.5 * self.num_frames) / feat_stride

dingfengshi commented 8 months ago

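seconds * fps gives the frame index. The temporal features start from the center of the first window (size = num_frames), so the center offset is num_frames / 2. The next feature is the center of the second window, whose frame index is last_index + feat_stride. Subtracting the offset and dividing by feat_stride therefore converts a frame index into a position on the feature grid. A timestamp that falls before the center of the first window maps to a small negative value, which is what the code comment refers to.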
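
To make the arithmetic concrete, here is a minimal sketch of the same conversion as a standalone function. The function name `seconds_to_feature_grid` and the values fps = 30, num_frames = 16 and feat_stride = 4 are assumptions chosen for illustration; they are not taken from the repository's configs.

```python
import numpy as np

def seconds_to_feature_grid(segments_sec, fps, num_frames, feat_stride):
    """Map timestamps in seconds to (fractional) feature-grid positions.

    Hypothetical helper that mirrors the expression quoted in the question.
    """
    # seconds * fps -> frame index
    frame_idx = np.asarray(segments_sec, dtype=np.float64) * fps
    # feature 0 sits at the center of the first window of num_frames frames
    center_offset = 0.5 * num_frames
    # consecutive features are feat_stride frames apart
    return (frame_idx - center_offset) / feat_stride

# Illustrative values only (assumed, not from the TriDet configs).
fps, num_frames, feat_stride = 30.0, 16, 4

# One action from 0 s to 1 s: frames 0 and 30.
grid = seconds_to_feature_grid(np.array([[0.0, 1.0]]), fps, num_frames, feat_stride)
print(grid)  # start -> (0 - 8) / 4 = -2.0, end -> (30 - 8) / 4 = 5.5
```

With these assumed values, feature i is centered at frame 8 + 4 * i, so a timestamp at frame 8 lands exactly on grid position 0, and the action start at 0 s gets the small negative position -2.0.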