Closed Chenhongchang closed 7 months ago
I generate target_perframe.npy at the frame level. For example, if a video is T seconds long and the FPS is 4, the video has 4T frames. You can read the annotations from the .txt file and convert the times in seconds to frame indices (by multiplying by the FPS). Then, for each frame, if any action occurs in that frame, the value corresponding to that action in the numpy array (along the class dimension) is incremented by one. Finally, you get a 4T x number-of-classes array.
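The procedure above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code; the annotation format, the helper name `build_target_perframe`, and the class count are assumptions.

```python
import numpy as np

FPS = 4           # frames per second, as in the example above
NUM_CLASSES = 22  # hypothetical number of action classes

def build_target_perframe(duration_sec, annotations):
    """Build a (FPS*T, NUM_CLASSES) per-frame label array.

    `annotations` is a list of (start_sec, end_sec, class_idx) tuples,
    a hypothetical parsed form of the .txt annotation file.
    """
    num_frames = int(duration_sec * FPS)
    target = np.zeros((num_frames, NUM_CLASSES), dtype=np.int64)
    for start_sec, end_sec, cls in annotations:
        # Convert seconds to frame indices by multiplying by the FPS
        start_f = int(start_sec * FPS)
        end_f = min(int(end_sec * FPS), num_frames)
        # Increment the class entry for every frame the action covers
        target[start_f:end_f, cls] += 1
    return target

# Example: a 10-second video with two overlapping actions
tgt = build_target_perframe(10, [(1.0, 3.5, 2), (2.0, 5.0, 7)])
print(tgt.shape)  # (40, 22)
```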
Thanks for your response! In the paper, the videos have 24 frames per second and 6 frames per chunk. Then which frame's label is used as the chunk-wise label? For example, when one frame in a chunk is labeled action 1 and another is labeled action 2, which is the annotation of this chunk, action 1 or action 2? And what about the last chunk, which might have fewer than 6 frames? Is it ignored or padded?
You can use the action of the center frame in the chunk as the target label. When the last chunk doesn't have 6 frames, I just ignore it.
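A rough sketch of that chunking rule, assuming a per-frame target array as described earlier; the function name `chunk_labels` is hypothetical and this is not the repository's actual code:

```python
import numpy as np

CHUNK_SIZE = 6  # frames per chunk, as in the paper (24 fps video)

def chunk_labels(target_perframe):
    """Use the center frame's label as each chunk's label.

    Drops the trailing chunk if it has fewer than CHUNK_SIZE frames,
    as described above. `target_perframe` has shape
    (num_frames, num_classes).
    """
    num_frames = target_perframe.shape[0]
    num_chunks = num_frames // CHUNK_SIZE  # incomplete last chunk ignored
    # Index of the center frame within each chunk
    centers = np.arange(num_chunks) * CHUNK_SIZE + CHUNK_SIZE // 2
    return target_perframe[centers]

# Example: 20 frames -> 3 full chunks; the last 2 frames are dropped
perframe = np.zeros((20, 5), dtype=np.int64)
perframe[9, 3] = 1  # frame 9 is the center of chunk 1 (frames 6..11)
out = chunk_labels(perframe)
print(out.shape)  # (3, 5)
```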
Thanks for your response!
I'm uncertain about how to create the target_perframe file. Could you please describe how to create it and provide the relevant code? It would greatly assist me.