Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" (ECCV 2020).
The project website is XD-Violence, where the features can be downloaded.
For these features, we oversample each video frame with the “5-crop” augmentation, which crops each image into the center and the four corners. Files ending in _0.npy hold the center crop; files ending in _1.npy to _4.npy hold the corner crops.
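As a rough illustration of the naming scheme, the sketch below collects the five crop files for one video and stacks them into a single array. It is not taken from this repository: the helper names `crop_paths` and `load_crops`, the `<video_stem>_<i>.npy` pattern, and the assumption that all five files of a video share the same shape are hypothetical and should be checked against the downloaded features.

```python
from pathlib import Path

import numpy as np

NUM_CROPS = 5  # suffix _0 = center crop, _1 to _4 = corner crops


def crop_paths(feature_dir, video_stem):
    """Return the five expected feature paths for one video, center crop first.

    Assumes files are named <video_stem>_<i>.npy (hypothetical pattern).
    """
    return [Path(feature_dir) / f"{video_stem}_{i}.npy" for i in range(NUM_CROPS)]


def load_crops(feature_dir, video_stem):
    """Load the five crop features and stack them along a new leading axis.

    Assumes every crop file of the same video has the same array shape.
    """
    return np.stack([np.load(p) for p in crop_paths(feature_dir, video_stem)])
```

With the features for a video in a folder, `load_crops(folder, video_stem)` would return an array whose first axis indexes the crop, so index 0 is the center crop.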
To run inference, run infer.py.
The model is in the ckpt folder.
Thanks for your attention!