tianyu0207 / RTFM

Official code for 'Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning' [ICCV 2021]

The exact changes to make to train on UCF-Crime #20

Closed nathanb97 closed 3 years ago

nathanb97 commented 3 years ago

Hello! First, congratulations on the excellent paper.

What are the exact changes needed to train your model on UCF-Crime with your code?

From what I noted in the paper and in the various issues:

  • assign args.batch_size = 32 (after concatenating the normal and abnormal batches this gives an effective batch size of 64; when I set it to 64 (128 effective), the results do not improve)
  • weight_decay = 0.0005
  • and in dataset.py:

    if self.is_normal:
        self.list = self.list[810:]
    else:
        self.list = self.list[:810]

Is that all?

Even after these changes I do not reach the reported performance. I get a maximum of:

  • auc: 0.75
  • pr_auc: 0.18588291392503292

tianyu0207 commented 3 years ago

Hi, the only thing that needs to change from the GitHub code is the split of normal and abnormal videos in dataset.py. You should get around 84% AUC with only this change. Thanks.

    if self.is_normal:
        self.list = self.list[810:]
    else:
        self.list = self.list[:810]
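For context: the UCF-Crime training list has 810 anomalous videos followed by 800 normal ones, which is what the slice at index 810 relies on. A minimal, self-contained sketch of the same slicing (the file names are made up for illustration):

```python
# Hypothetical list file ordering: 810 abnormal entries first, then 800 normal ones
video_list = [f"Abnormal/{i:03d}.npy" for i in range(810)] + \
             [f"Normal/{i:03d}.npy" for i in range(800)]

def select(video_list, is_normal):
    # Same slicing as the dataset.py change above
    return video_list[810:] if is_normal else video_list[:810]

print(len(select(video_list, True)), len(select(video_list, False)))  # 800 810
```

If the list file orders the classes differently, the slice index has to change accordingly, which is one easy way to end up training on a wrong split.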

nathanb97 commented 3 years ago

Thank you for your reply, but that change is already in place. After several tests I still get a maximum of 0.75 AUC. This seems to be explained by the fact that we do not use the same preprocessing. Can you tell me how you did the preprocessing before I3D? In the I3D repo you recommended, the tensors fed to I3D seem to be normalized between -1 and 1 using statistics from the Kinetics dataset:

    mean = [114.75, 114.75, 114.75]
    std = [57.375, 57.375, 57.375]

with a normalization computed on the Kinetics dataset:

    class GroupNormalize(object):
        def __init__(self, mean, std):
            self.mean = mean
            self.std = std

        def __call__(self, tensor):  # (T, 3, 224, 224)
            for b in range(tensor.size(0)):
                for t, m, s in zip(tensor[b], self.mean, self.std):
                    t.sub_(m).div_(s)
            return tensor
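For reference, the per-frame, per-channel loop in a class like the one above is equivalent to a single broadcasted operation. A numpy sketch of the equivalence (shapes mirror the snippet above, not the official repo):

```python
import numpy as np

def group_normalize_loop(x, mean, std):
    # Per-frame, per-channel loop, mirroring a GroupNormalize-style __call__
    x = x.copy()
    for b in range(x.shape[0]):          # frames
        for c in range(x.shape[1]):      # channels
            x[b, c] = (x[b, c] - mean[c]) / std[c]
    return x

def group_normalize_vec(x, mean, std):
    # Same result via broadcasting over the (T, 3, H, W) array
    mean = np.asarray(mean).reshape(1, -1, 1, 1)
    std = np.asarray(std).reshape(1, -1, 1, 1)
    return (x - mean) / std

x = np.random.rand(4, 3, 8, 8) * 255
m, s = [114.75] * 3, [57.375] * 3
assert np.allclose(group_normalize_loop(x, m, s), group_normalize_vec(x, m, s))
```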

Did you compute new statistics on UCF-Crime or keep the Kinetics statistics? Was the normalization to [0, 1] or to [-1, 1]?

I did this ten-crop without normalization (pixels only scaled to [0, 1]) and obtained a maximum result of 0.75:

    crop10 = transforms.Compose([
        transforms.Resize(256),
        transforms.TenCrop(256),  # this returns a tuple of PIL Images
        transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),  # returns a 4D tensor
        # optional, uncomment this line: transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

Can you specify exactly how you do the ten-crop on an image?
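For readers following along: the standard ten-crop (what torchvision.transforms.TenCrop computes) is the four corner crops plus the center crop, each together with its horizontal flip. A minimal numpy sketch, not the authors' code:

```python
import numpy as np

def ten_crop(img, size):
    """Four corner crops + center crop, each plus its horizontal flip.

    img: (H, W, C) array; size: side length of the square crop.
    Returns a (10, size, size, C) array."""
    h, w = img.shape[:2]
    assert h >= size and w >= size, "crop size larger than image"
    tl = img[:size, :size]                # top-left
    tr = img[:size, w - size:]            # top-right
    bl = img[h - size:, :size]            # bottom-left
    br = img[h - size:, w - size:]        # bottom-right
    i, j = (h - size) // 2, (w - size) // 2
    center = img[i:i + size, j:j + size]
    crops = [tl, tr, bl, br, center]
    flips = [c[:, ::-1] for c in crops]   # horizontal mirror of each crop
    return np.stack(crops + flips)

img = np.random.rand(256, 340, 3)
print(ten_crop(img, 224).shape)  # (10, 224, 224, 3)
```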

nathanb97 commented 3 years ago

Today I tested a normalization with GroupNormalize using mean = [94.9191, 93.7068, 92.1115] and var = [39.12006135, 38.95593793, 39.355997].

Here are some results comparing my preprocessing with your preprocessing:

norm between preprocess i3d : 583.2708129882812
their score  max: 0.975659191608429 mean : 0.6715472936630249
our score:   max: 0.9387404322624207 mean : 0.6065698266029358

norm between preprocess i3d : 516.7003784179688
their score  max: 0.8348086476325989 mean : 0.4239957332611084
our score:   max: 0.8746155500411987 mean : 0.6011001467704773

norm between preprocess i3d : 710.51416015625
their score  max: 0.9999951124191284 mean : 0.3108385503292084
our score:   max: 0.8753393292427063 mean : 0.5222576260566711
tianyu0207 commented 3 years ago

> today I tested, a normalization with GroupNormalize […]

Below is my feature extraction setup.

    mean = [114.75, 114.75, 114.75]
    std = [57.375, 57.375, 57.375]

    if split == '10_crop_ucf':
        transform = transforms.Compose([
            gtransforms.GroupResize(256),
            gtransforms.GroupTenCrop(224),
            gtransforms.ten_crop_ToTensor(),
            gtransforms.GroupNormalize_ten_crop(mean, std),
            gtransforms.LoopPad(max_len),
        ])

    class GroupTenCrop(object):
        def __init__(self, size):
            self.worker = torchvision.transforms.Compose([
                torchvision.transforms.TenCrop(size),
                torchvision.transforms.Lambda(lambda crops: torch.stack(
                    [torchvision.transforms.ToTensor()(crop) for crop in crops])),
            ])

        def __call__(self, img_group):
            return [self.worker(img) for img in img_group]

    class ToTensor(object):
        def __init__(self):
            self.worker = lambda x: F.to_tensor(x) * 255

        def __call__(self, img_group):
            img_group = [self.worker(img) for img in img_group]
            return torch.stack(img_group, 0)
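Note the order in this setup: pixels are scaled back to [0, 255] (the ToTensor worker multiplies by 255) before the mean 114.75 and std 57.375 are applied; these are the Kinetics constants, 255 * 0.45 and 255 * 0.225. A minimal numpy sketch of that pixel mapping, not taken from the repo:

```python
import numpy as np

# Kinetics normalization constants quoted above
MEAN, STD = 114.75, 57.375

def normalize_pixels(x01):
    """Map pixels given in [0, 1] the way ToTensor * 255 followed by
    a (MEAN, STD) group-normalize would."""
    return (np.asarray(x01, dtype=np.float64) * 255.0 - MEAN) / STD

out = normalize_pixels([0.0, 0.45, 1.0])
print(out)  # black -> -2.0, mid-grey (0.45) -> 0.0, white -> ~2.444
```

So the network does not see inputs strictly in [-1, 1] but roughly in [-2.0, 2.44], which is a plausible source of the mismatch discussed in this thread if a different scaling was used at feature-extraction time.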

daviduarte commented 3 years ago

Same problem here. I used precomputed I3D features. I ran training 3 times (50 epochs each) and got the following AUCs on the test set (evaluating every 5 epochs):

0.791 (at epoch 50), 0.780 (at epoch 10), 0.80 (at epoch 10)

coranholmes commented 2 years ago

Hi, I have the same issue. Is it possible for any of you to share the code for the ten-crop preprocessing?