mchengny / RWF2000-Video-Database-for-Violence-Detection

A large-scale video database for violence detection, containing 2,000 video clips of violent or non-violent behaviours.
396 stars · 85 forks

Is there any experiment result on this dataset? #1

Closed pangwenfeng closed 4 years ago

pangwenfeng commented 4 years ago

Thanks for your work first! My question is: have you applied different algorithms to this dataset, and what is their performance? In other words, can you provide a baseline for researchers? Thanks again.

mchengny commented 4 years ago

Hi, thanks for your interest. Currently I have not implemented other algorithms on this dataset; as you know, it's quite time-consuming to reproduce others' models. But I have tried my method on different datasets (refer to arXiv). Also, anyone who can offer experimental results of other algorithms on this dataset is welcome, and I will organize the ranking in this repo.

pangwenfeng commented 4 years ago

Thanks for your reply! I have read your paper, and I am confused about the sampling method. In your paper, the sampling method is "sparsely sample frames from the video by a uniform interval"; however, if the lengths of the videos differ, how can you guarantee that the lengths of the clips after sampling are the same? And if you sample a fixed number of frames, some clips may not include the violent events. Looking forward to your reply, thank you!


mchengny commented 4 years ago

Thanks for your question.

(1) The uniform interval means: first I set a target number of sampled frames, then calculate the interval as (video length / target number). Besides, problems of human action recognition are usually treated as two categories: un-trimmed and trimmed. In un-trimmed action recognition, we want to know the start point and end point of an event, while in the trimmed case we only want to know whether the clip contains the action. This dataset is designed for trimmed video classification (or action recognition; I feel there is no very clear boundary between the definitions).

(2) Thus, in a trimmed video clip, a human action must be a consecutive motion that may span many frames in a short period. The model may not need to capture all the key frames. Actually, it is hard to exactly define the start and end of a complex action, so from my perspective that distinction is a bit meaningless. Once the model captures some of the key frames, it can still work well.

(3) I also tried some different sampling methods. For example, first split the video clip into shorter snippets, then randomly sample a frame from each snippet. But these methods did not gain significant improvements. I hope there will be more efficient and powerful methods to sample key frames from an entire video.

(4) In conclusion, feeding in longer video clips will indeed improve model performance, but it costs too much computing resources. This is a trade-off between accuracy and speed.
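For clarity, here is a minimal sketch of the two sampling strategies described above: uniform-interval sampling with the interval computed as (video length / target number), and the snippet-based random sampling tried as an alternative. The function names and the default target of 64 frames are illustrative, not taken from the paper's implementation.

```python
import random

def uniform_sample(num_frames, target=64):
    # Uniform-interval sampling: interval = video length / target number,
    # so clips of any length yield exactly `target` frame indices.
    interval = num_frames / target
    return [int(i * interval) for i in range(target)]

def snippet_random_sample(num_frames, target=64, seed=0):
    # Alternative from point (3): split the clip into `target` snippets,
    # then randomly pick one frame from each snippet.
    rng = random.Random(seed)
    edges = [round(i * num_frames / target) for i in range(target + 1)]
    return [rng.randrange(lo, max(lo + 1, hi))
            for lo, hi in zip(edges[:-1], edges[1:])]
```

Both functions return a fixed number of indices regardless of clip length, which is why sampled clips all end up the same length even when the source videos differ.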

Thank you for your question again. Due to the page limit of the paper, some details may not be explained clearly; feel free to open an issue in this repo!