Some trajectories missed in the frames and the question in table 1.

RomeroBarata / skeleton_based_anomaly_detection

Code for the CVPR'19 paper "Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos"

Some trajectories missed in the frames and the question in table 1. #11

Open pangwenfeng opened 4 years ago

pangwenfeng commented 4 years ago

Hello, and thanks for your inspiring work. After visualizing the keypoint trajectories, I found that the trajectories of some targets are missing in certain frames, especially when the target moves fast, for example when a man is running. I think this could seriously affect performance. How do you handle this problem? Also, in Table 1, the performance of Conv-AE [13] (0.704) is quite different from the result (0.609) reported in the paper "Future Frame Prediction for Anomaly Detection - a New Baseline", which confuses me. Looking forward to your reply, thank you very much.

Cambrainnnnn commented 4 years ago

I noticed this issue too. In the appendix of the paper, the authors say:

As for the HR-Avenue dataset, since the original Avenue dataset contains only 21 testing videos, we ignored segments of the videos where the anomalies were not detectable by the pose detector we employed or where the anomaly was not related to a human.

And in Section 5:

Apparently, MPED-RNN’s performance still depends on the quality of skeleton detection and tracking. This problem becomes more significant in the case of low quality videos. It prevents us from trying our method on UCSD Ped1/Ped2 [21], another popular dataset whose video quality is too low to detect skeletons.

I assume the authors are aware of this issue, but I don't know whether they have a solution. The missing trajectories are caused by the pose detector. A more precise detector might solve the problem, but that would be a separate field of work.

pangwenfeng commented 4 years ago

Hi @Cambrainnnnn, thanks for your reply. I think the authors discard these segments by checking the videos frame by frame and do not use them in the testing phase, which would be time-consuming work. As for Table 1, the performance of Conv-AE [13] (0.704) is quite different from the result (0.609) reported in the paper "Future Frame Prediction for Anomaly Detection - a New Baseline". Do you know the reason? Thanks again.

Cambrainnnnn commented 4 years ago

I hadn't noticed this difference. Let's wait for the author to answer.

RomeroBarata commented 4 years ago

Hi @pangwenfeng and @Cambrainnnnn ,

Indeed, we rely on the pose detector and tracker, and anything they miss we just ignore. The tracker itself still tries to recover a trajectory even if the detector missed a few frames, but if the detector misses more than five frames the tracker is likely to start a new trajectory. Missed detections are then filled with zeros in the data.
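
As a rough illustration of the behaviour described above, here is a minimal sketch (not code from this repository). It assumes detections for one person are stored as a hypothetical `{frame_index: keypoints}` dictionary, that each skeleton has 17 two-dimensional keypoints, and that a gap of more than five missed frames starts a new trajectory. The function names `split_on_gaps` and `zero_fill` are made up for this example; the real tracker's logic may differ.

```python
import numpy as np

NUM_KEYPOINTS = 17  # assumption: COCO-style skeleton, not stated in this thread
MAX_GAP = 5         # from the comment above: more than five missed frames likely starts a new trajectory


def split_on_gaps(detections, max_gap=MAX_GAP):
    """Split {frame_index: keypoints} into trajectories with at most max_gap missed frames per gap."""
    if not detections:
        return []
    frames = sorted(detections)
    runs, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        missed = cur - prev - 1  # number of frames with no detection between prev and cur
        if missed > max_gap:
            runs.append(current)
            current = [cur]
        else:
            current.append(cur)
    runs.append(current)
    return [{f: detections[f] for f in run} for run in runs]


def zero_fill(trajectory):
    """Return (first_frame, array of shape (T, K, 2)); frames without a detection stay all zeros."""
    frames = sorted(trajectory)
    first, last = frames[0], frames[-1]
    sample = np.asarray(trajectory[first], dtype=float)
    filled = np.zeros((last - first + 1,) + sample.shape, dtype=float)
    for f in frames:
        filled[f - first] = trajectory[f]
    return first, filled


# Hypothetical example: detections at frames 0, 1, 4 and then 12, 13.
kp = np.ones((NUM_KEYPOINTS, 2))
detections = {0: kp, 1: kp, 4: kp, 12: kp, 13: kp}
parts = split_on_gaps(detections)     # the 7-frame gap between 4 and 12 splits the trajectory
first, filled = zero_fill(parts[0])   # frames 2 and 3 become zero rows
```

In this toy input, the two missed frames between frames 1 and 4 are kept inside one trajectory and zero-filled, while the longer gap before frame 12 produces a second trajectory.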

As for the difference between the Conv-AE performance we report and the one reported in previous work, it comes from re-implementation. The original Conv-AE work doesn't include experiments on the ShanghaiTech dataset, so we independently re-implemented the method and ran it on ShanghaiTech ourselves.

A more accurate detector and tracker would definitely improve the performance of the method, as noted by @Cambrainnnnn. Especially for the Avenue dataset, most sources of error were related to bad detection and tracking.