Open cuixianheng opened 2 years ago
Hello, I would like to ask how you determined the bounding box of the same person across five adjacent frames in the paper “Cloze Test Helps: Effective Video Anomaly Detection via Learning to Complete Video Events”?
If I understand your question correctly, please refer to the last paragraph of Sec. 3.1 (i.e., Spatio-temporal Cube Construction) of our paper. Note that we do not track the objects. For each object, we use its bounding box/RoI in the current frame to extract 5 patches: one from the current frame and one from each of the 4 adjacent frames. Since an object usually moves very little over a few adjacent frames, the same bounding box can cover the object in those frames as well.
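To make the idea concrete, here is a minimal sketch of cropping the same box from 5 frames. It is only an illustration, not the code from our repository: the function name `extract_cube`, the `(x1, y1, x2, y2)` box layout, the choice of 2 frames before and 2 after the current one, and the clamping of indices at the start/end of the video are all assumptions made for this example.

```python
import numpy as np

def extract_cube(frames, t, box, num_frames=5):
    """Crop the same bounding box from frame t and its neighbours.

    frames: array of shape (T, H, W, C)
    t:      index of the current frame (where the box was detected)
    box:    (x1, y1, x2, y2) in pixel coordinates, detected at frame t only
    """
    x1, y1, x2, y2 = box
    half = num_frames // 2
    total = frames.shape[0]
    # Assumption: take frames t-2 .. t+2 and repeat the first/last frame
    # when the window would run past the video boundaries.
    idx = np.clip(np.arange(t - half, t + half + 1), 0, total - 1)
    # No tracking: the identical box is reused for every selected frame.
    return frames[idx, y1:y2, x1:x2]

# Toy video: 10 frames of 64x64 RGB, where every pixel stores its frame index.
video = np.stack([np.full((64, 64, 3), i, dtype=np.uint8) for i in range(10)])
cube = extract_cube(video, t=5, box=(10, 20, 30, 50))
```

The resulting `cube` stacks 5 patches of identical size along the first axis. Any resizing or normalization of the patches is left out of this sketch.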