Results in Figure 4 #8

coranholmes / TEVAD

Official implementation for paper TEVAD: Improved video anomaly detection with captions


Hi, can you please tell me how you produced the results shown in Figure 4 of your paper? (Figure 4. Example results from (a) ShanghaiTech (riding a bike), (c) XD-Violence (shooting), and (b) UCF-Crime (arrest) datasets showing the contribution of each word in the caption to the snippet anomaly score. An image frame of the abnormal event from the snippet is also shown on the right of each caption.)

How do you get each word's score? Did you use an attention map?

Page 8 of the original paper describes how the scores in Figure 4 are calculated: "The score above each word in the caption is the difference between the anomaly score by masking this word and the original anomaly score without masking." So the scores come from masking words one at a time and re-scoring the snippet, not from an attention map.
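For anyone who wants to try this, here is a minimal sketch of the masking procedure the quoted sentence describes. It is not taken from the TEVAD repository. `word_contributions`, `score_fn`, `mask_token`, and `toy_score` are all hypothetical names, and replacing a word with a `[MASK]` placeholder (rather than deleting it) is an assumption. In practice `score_fn` would wrap the whole pipeline from caption text to snippet anomaly score; here a toy scorer stands in so the example runs on its own.

```python
from typing import Callable, List, Tuple


def word_contributions(
    caption: str,
    score_fn: Callable[[str], float],
    mask_token: str = "[MASK]",  # assumption: the paper may mask differently
) -> List[Tuple[str, float]]:
    """Return (word, masked_score - original_score) for each word in the caption."""
    words = caption.split()
    original = score_fn(caption)
    contributions = []
    for i, word in enumerate(words):
        # Replace only the i-th word, keep the rest of the caption intact.
        masked = " ".join(words[:i] + [mask_token] + words[i + 1:])
        contributions.append((word, score_fn(masked) - original))
    return contributions


def toy_score(caption: str) -> float:
    """Hypothetical stand-in for the real model: fraction of 'suspicious' words."""
    suspicious = {"gun", "shooting"}
    words = caption.lower().split()
    return sum(w in suspicious for w in words) / len(words)


if __name__ == "__main__":
    for word, delta in word_contributions("a man holding a gun", toy_score):
        print(f"{word:>8s}  {delta:+.2f}")
```

With the difference taken exactly as quoted (masked minus original), a word that pushes the anomaly score up gets a negative value, because masking it lowers the score. Flip the sign if you prefer positive values for words that contribute to the anomaly.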
