Training on Ped2 is unstable

SDret commented 1 year ago

Thanks to the authors for such impressive work in VAD. For now, I believe it is almost the best work in this field!

Just wondering when the complete code for reproducing the paper's benchmark results will be released? With the current version of the code, I have difficulty reproducing the reported result on ped2.

FlappyPeggy commented 1 year ago

Thank you very much for this issue!

To rule out differences between our private code and the GitHub version, we cloned the released code and re-ran Evaluate_ped2.py, passing the parameters "dataset_path" and "model_dir".

The AUC is still 99.7 (99.690456%), as reported in the paper.

So I wonder if your implementation details (e.g. dataset, image pre-processing, post-processing, etc.) are slightly different?

FlappyPeggy commented 1 year ago

The complete version is expected to be released no later than the inclusion of THIS paper, or once the corresponding paper on the improved part is accepted.

In fact, we hope to draw attention to the meaning and effect of diversity measurement (modeling), rather than to the specific deformation-based implementation.

FlappyPeggy commented 1 year ago

I noticed another possible cause: see readme.md -> Training & Testing -> Pre-Processed Files. Undesired results can be caused by loading an incorrect background template (or by not loading a template at all).

SDret commented 1 year ago

Thanks for the response! I can reproduce the result with the pre-trained model. However, when I write my own training script with the training configuration given in the paper, I only get ~97% AUROC on ped2. Would it be possible to release the training code for ped2, or to give some hints on this issue (I have already used the provided background jpeg)? Thanks!

FlappyPeggy commented 1 year ago

The instability of the training process is mainly due to the uniform hyperparameters (especially for ped2). You may benefit from the following information:

It may be beneficial to constrain smoothness in both directions separately.

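```
import torch
import torch.nn as nn
import torch.nn.functional as F


class Smooth_Loss(nn.Module):
    """Smoothness penalty using separate 1-D high-pass filters along x and y."""

    def __init__(self, channels=2, ks=7, alpha=1):
        super(Smooth_Loss, self).__init__()
        self.alpha = alpha
        self.ks = ks
        # Zero-sum kernel: 1 at the center, -1/(ks-1) everywhere else.
        kernel = torch.full((1, ks), -1.0 / (ks - 1))
        kernel[0, ks // 2] = 1
        # Buffers follow the module's device (call .cuda() / .to(device) on the module);
        # this replaces the hard-coded .cuda() on the filter tensor.
        self.register_buffer("filter_x", kernel.view(1, 1, 1, ks).repeat(1, channels, 1, 1), persistent=False)
        self.register_buffer("filter_y", kernel.view(1, 1, ks, 1).repeat(1, channels, 1, 1), persistent=False)

    def forward(self, gen_frames):
        # Zero-pad only along the filtered axis; for odd ks the output keeps H x W.
        gen_frames_x = F.pad(gen_frames, (self.ks // 2, self.ks // 2, 0, 0))
        gen_frames_y = F.pad(gen_frames, (0, 0, self.ks // 2, self.ks // 2))
        gen_dx = F.conv2d(gen_frames_x, self.filter_x)
        gen_dy = F.conv2d(gen_frames_y, self.filter_y)
        smooth_xy = torch.abs(gen_dx) + torch.abs(gen_dy)

        return torch.mean(smooth_xy ** self.alpha)
```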
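Read as a formula, the loss above appears to compute the following. This is only a reading of the code, not notation from the paper: $o_c$ stands for channel $c$ of `gen_frames`, $C$ for `channels`, $k$ for `ks` (assumed odd), $r = \lfloor k/2 \rfloor$, and values outside the image are taken as zero because of the padding.

$$
d_x(i,j) = \sum_{c=1}^{C}\Big[\,o_c(i,j) - \frac{1}{k-1}\sum_{\substack{t=-r \\ t \neq 0}}^{r} o_c(i,\,j+t)\Big],
\qquad
d_y(i,j) = \sum_{c=1}^{C}\Big[\,o_c(i,j) - \frac{1}{k-1}\sum_{\substack{t=-r \\ t \neq 0}}^{r} o_c(i+t,\,j)\Big]
$$

$$
\mathcal{L}_{\text{smooth}} = \operatorname{mean}_{n,i,j}\Big(\,|d_x(i,j)| + |d_y(i,j)|\,\Big)^{\alpha}
$$

Since the kernel weights sum to zero, a locally constant field gives a zero response away from the borders, so only deviations from the local average along each axis are penalized.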

If you don't want to change the hyperparameters in the paper, I suggest that you do NOT enable deterministic training. For unknown reasons, deterministic training seems to hurt performance in some cases (regardless of the seed). In this case, you may need to re-run the experiment multiple times with different seeds:
```
torch.backends.cudnn.deterministic = False
```
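One way to organize the repeated runs suggested above is to wrap a single training-plus-evaluation run in a function and summarize the AUC across seeds. The sketch below uses only the standard library. `summarize_over_seeds` and `run_once` are hypothetical names, and `run_once` is assumed to be your own function that seeds the run, trains with `torch.backends.cudnn.deterministic = False`, and returns the test AUC.

```python
import statistics
from typing import Callable, Dict, Iterable


def summarize_over_seeds(run_once: Callable[[int], float],
                         seeds: Iterable[int]) -> Dict[str, float]:
    """Call run_once(seed) for every seed and summarize the returned AUCs.

    run_once is a user-supplied placeholder: it should train and evaluate
    one model with the given seed and return its AUC as a float.
    """
    aucs = [run_once(seed) for seed in seeds]
    return {
        "mean": statistics.mean(aucs),
        # The sample standard deviation needs at least two runs.
        "std": statistics.stdev(aucs) if len(aucs) > 1 else 0.0,
        "best": max(aucs),
    }
```

A call such as `summarize_over_seeds(train_and_eval, range(5))`, with `train_and_eval` being your own training entry point, then gives the mean and spread to compare against the numbers reported in the paper.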

Or you can get relatively stable results by modifying the parameter $\gamma$ and enabling early stopping (because the strength constraint inherently works against the reconstruction loss):

```
self.grad_loss = self.beta[0]*(self.loss_grad(x, recon_x).mean() + self.loss_grad(x, z_q_).mean()*0.25)
self.offset_loss1 = ((offset1 ** 2).sum(dim=-1) ** 0.5).mean()*0.4
self.offset_loss2 = ((offset2 ** 2).sum(dim=-1) ** 0.5).mean()*0.4
```
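The early stopping mentioned above is not shown in the thread. A minimal, framework-independent sketch could look like the following. The class name `EarlyStopper`, its parameters, and the choice of monitored score (higher is better, for example an AUC measured at regular intervals) are all assumptions, not the repository's implementation.

```python
class EarlyStopper:
    """Signals a stop once the score has not improved for `patience` checks."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience      # allowed consecutive checks without improvement
        self.min_delta = min_delta    # minimum gain that counts as an improvement
        self.best_score = float("-inf")
        self.best_epoch = -1
        self.bad_checks = 0

    def step(self, score: float, epoch: int) -> bool:
        """Record a new score (higher is better); return True if training should stop."""
        if score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

In a training loop, calling `stopper.step(score, epoch)` after each evaluation and saving a checkpoint whenever `stopper.best_epoch == epoch` keeps the best model seen before the strength constraint starts to dominate.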

For unknown reasons, the results on Ped2 and ShanghaiTech with a 2080Ti and torch 1.9.0 are significantly worse than those with a 1080Ti and torch 1.6.0 or torch 1.9.0. A similar gap has not been observed on Avenue.
We have also found the main cause of the instability, which will be linked from this repo after the relevant paper is accepted (the improved version is more stable: std 0.6 -> 0.2).

SDret commented 1 year ago

Thanks! I will try it and see whether it improves stability or performance.

zhangzilongc commented 1 year ago

I also think that this is the best work in VAD, especially for industrial AD.

Sweiying commented 3 months ago

> Thanks for the response! I can reproduce the result with the pre-trained model. However, when I write my own training script with the training configuration given in the paper, I only get ~97% AUROC on ped2. Would it be possible to release the training code for ped2, or to give some hints on this issue (I have already used the provided background jpeg)? Thanks!

Hi, may I ask whether you were able to reach the accuracy reported in the paper on Avenue?
