LiUzHiAn / hf2vad


Problem in reproducing results on Avenue #3

Closed · NeeluMadanCreate closed this 2 years ago

NeeluMadanCreate commented 2 years ago

Hi,

Thank you for sharing your contribution to anomaly detection. I tried to reproduce the baseline on the Avenue dataset, but I got only 89.06% AUC after retraining, and 90.3% even with the pre-trained model. In the paper, it is 91.1%. Could you please share what could be a possible cause for this mismatch?

Thanks! Neelu

LiUzHiAn commented 2 years ago

Hi,

I ran the evaluation code again, as

python eval.py \
  --model_save_path=./pretrained_ckpts/avenue_HF2VAD_91.15.pth \
  --cfg_file=./pretrained_ckpts/avenue_HF2VAD_91.15_cfg.yaml

and the result is still 0.91147.

Did you follow the same preprocessing procedure? And did you download the right pre-trained weights and the corresponding config file?

LiUzHiAn commented 2 years ago

[screenshot: evaluation output showing the AUC result]

NeeluMadanCreate commented 2 years ago

Thanks for your reply. I followed the same pre-processing steps you mentioned in the repository; however, I was wondering how you got "Test001_gt"? I didn't get those folders using the same preprocessing steps. I thought the GT file was already inside the "ground_truth_demo" folder. Could you please check?

[screenshot: test-set folder listing with TestXXX_gt directories]

Thanks again!

LiUzHiAn commented 2 years ago

These pixel-level ground truth masks were officially provided with the UCSD Ped2 dataset; you can download them from the dataset page.

The gt_label.json provided in this repo was built exactly from these pixel-level masks, so you don't actually need the XXX_gt folders.
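
For illustration, frame-level labels like those in gt_label.json can be derived from pixel-level masks by marking a frame anomalous whenever its mask contains any nonzero pixel. This is only a sketch under that assumption; the folder and file names here are hypothetical, and the repo's actual construction may differ.

import json
import os

import numpy as np
from PIL import Image

def masks_to_frame_labels(gt_dir):
    # A frame is anomalous (1) if its mask has any nonzero pixel, else normal (0).
    labels = []
    for name in sorted(os.listdir(gt_dir)):
        mask = np.array(Image.open(os.path.join(gt_dir, name)))
        labels.append(int(mask.any()))
    return labels

# Hypothetical layout: one folder of per-frame mask images per test clip.
all_labels = {clip: masks_to_frame_labels(os.path.join("ground_truth", clip))
              for clip in sorted(os.listdir("ground_truth"))}
with open("gt_label.json", "w") as f:
    json.dump(all_labels, f)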

NeeluMadanCreate commented 2 years ago

Thanks for confirming! Could you please also confirm the following:

  1. In finetune_cfg.yaml, the parameters don't look right to me:

     mem_usage: [ False, False, False, True ]
     skip_ops: [ "none", "none", "none" ]

     Shouldn't they be as below?

     mem_usage: [ False, True, True, True ]
     skip_ops: [ "none", "concat", "concat" ]

  2. For ShanghaiTech, you mentioned two memory modules, so it should be:

     mem_usage: [ False, False, True, True ]
     skip_ops: [ "none", "concat", "concat" ]

Request you to please confirm. Thanks!

LiUzHiAn commented 2 years ago

@NeeluMadanCreate,

  1. You are right, and thanks for pointing this out. I've updated the finetune_cfg.yaml file.
  2. Not exactly; for ShanghaiTech, it should be mem_usage: [ False, False, True, True ], skip_ops: [ "none", "none", "concat" ].

You can refer to the config files here and use the corresponding pre-trained weights to get the 91.1% result.
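
For context, a rough sketch of how per-level flags like these could be wired into a decoder: a memory module is applied where mem_usage is True, and encoder features are fused where the skip op is "concat". The names and structure below are illustrative assumptions, not the repo's actual code.

import torch

mem_usage = [False, True, True, True]    # hypothetical: one flag per level
skip_ops = ["none", "concat", "concat"]  # hypothetical: one op per skip link

def fuse_level(feat, skip_feat, level, memory_modules):
    # Pass the feature through this level's memory module, if enabled.
    if mem_usage[level]:
        feat = memory_modules[level](feat)
    # Fuse the encoder skip feature according to the configured op.
    op = skip_ops[level] if level < len(skip_ops) else "none"
    if op == "concat":
        feat = torch.cat([feat, skip_feat], dim=1)  # channel-wise concat
    return feat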

LiUzHiAn commented 2 years ago

@NeeluMadanCreate

Since the 91.15% result on Avenue can be obtained, I am closing this issue.

Meenn commented 2 years ago

Hi, thank you for your contribution and for providing the code! I followed the same pre-processing steps you mentioned in the repository to prepare the data, and I ran eval with the pre-trained weights for ped2 and avenue. As a result, I got 0.993078 for ped2 and 0.903187 for avenue. Since the ped2 result is right, I think the data preparation process is correct? Could you please share what could be a possible cause for the avenue result (0.903187)?

LiUzHiAn commented 2 years ago

Did you use the corresponding config file (say, avenue_HF2VAD_91.15_cfg.yaml) for avenue?

Meenn commented 2 years ago

Thanks for your reply. Yes, I have downloaded the right pre-trained weights and the corresponding config file, and I ran eval as

python eval.py \
  --model_save_path=./pretrained_ckpts/avenue_HF2VAD_91.15.pth \
  --cfg_file=./pretrained_ckpts/avenue_HF2VAD_91.15_cfg.yaml

Finally, I got 0.903187 rather than the 0.91147 you mentioned. Could you please check whether you can get 0.91147 using the pre-trained weights from here?

Thanks!

LiUzHiAn commented 2 years ago

Yes, I can get 0.91147 using the ckpt and config file from the Google cloud link I shared. Could you please leave your email? That would make it more convenient to share some files.

LiUzHiAn commented 2 years ago

Hi guys,

Thanks for pointing out this issue. It was caused by a small bug during preprocessing. For the testing samples, I saved them into a single chunk file and extracted 100K samples by default (check here for details). But this misses some samples if the total sample number is greater than 100K, hence leading to the inconsistent AUROC results.

I've fixed this issue here, run the evaluation phase again, and the results can be obtained.
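
For illustration, a minimal sketch of the fix: write the test samples across as many chunk files as needed, rather than capping a single chunk at a fixed count. The function and file names are hypothetical, not the repo's actual preprocessing code.

import pickle

def save_in_chunks(samples, prefix, chunk_size=100000):
    # Persist every sample; a single 100K-capped chunk would silently drop
    # anything beyond the cap and skew the evaluation.
    for i in range(0, len(samples), chunk_size):
        with open(f"{prefix}_chunk_{i // chunk_size}.pkl", "wb") as f:
            pickle.dump(samples[i:i + chunk_size], f)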

Thank you.

LiUzHiAn commented 2 years ago

Furthermore, we found that the extracted flows also have some effect on the AUROC.

We resize the raw images to meet the input requirements of FlowNet2; the exact sizes are:

FLOWNET_INPUT_WIDTH = {"ped2": 512 * 2, "avenue": 512 * 2, "shanghaitech": 1024}
FLOWNET_INPUT_HEIGHT = {"ped2": 384 * 2, "avenue": 384 * 2, "shanghaitech": 640}

One can check more details here.
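
For illustration, the resizing step could look like the sketch below (OpenCV usage here is an assumption; note that all of the sizes above are multiples of 64, which FlowNet2 implementations commonly require of their inputs).

import cv2

FLOWNET_INPUT_WIDTH = {"ped2": 512 * 2, "avenue": 512 * 2, "shanghaitech": 1024}
FLOWNET_INPUT_HEIGHT = {"ped2": 384 * 2, "avenue": 384 * 2, "shanghaitech": 640}

def resize_for_flownet(frame, dataset):
    # cv2.resize expects (width, height); frame is an H x W x C array.
    size = (FLOWNET_INPUT_WIDTH[dataset], FLOWNET_INPUT_HEIGHT[dataset])
    return cv2.resize(frame, size)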