Closed olaurendin closed 2 years ago
Hi Olivier,
The parameters that you used seem correct. The only thing that may be missing is that I do not choose the best checkpoint based on the validation error programmatically (i.e. I saved all the checkpoints and, based on the training logs, chose the best epoch manually), and this epoch should be specified manually when testing. If you did not specify the epoch for testing, the last checkpoint is loaded. Because ShanghaiTech has almost 1M objects, I trained it for 3 epochs. How long did you train the network for?
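The manual selection described above can also be scripted once the per-epoch validation losses are collected. A minimal illustrative sketch (the values are copied from the training log excerpt posted later in this thread):

```python
# Illustrative sketch: pick the test checkpoint as the epoch with the
# lowest validation loss. The values mirror the training log in this thread.
val_loss = {0: 0.5816, 1: 0.5455, 2: 0.5013, 3: 0.4752, 4: 0.4755, 5: 0.4327}
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch)  # epoch 5 has the lowest validation loss
```

The chosen epoch is then passed as checkpoint_epoch at test time; otherwise the last checkpoint is loaded.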
The detection threshold for ShanghaiTech should be set to 0.8 (only for Ped2 do we set it to 0.5). When I downloaded the ShanghaiTech data set, I used their official site: https://svip-lab.github.io/dataset/campus_dataset.html . I know that the link is broken now. The repo that you used seems to do open-set anomaly detection, and it does not use the standard training/testing split. Did you use the standard training/testing split for one-class anomaly detection?
Best, Lili
Hi,
First of all, thank you very much for such a quick answer :)
I retried training and inference following your advice:
- I changed the prediction threshold of YOLOv3 to 0.8.
- I retrained the network on the new YOLO predictions and manually selected the epoch with the lowest validation loss (which in my case was clearly the 5th epoch).
- I changed the ground-truth anomaly labels: I converted the data contained in the test_frame_mask folder to a txt file for each video.
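For reference, a minimal sketch of that last conversion step, assuming test_frame_mask holds one .npy array of 0/1 frame labels per video (the paths and the demo file name below are hypothetical; adjust them to your dataset layout):

```python
import os
import numpy as np

# Hypothetical paths: convert per-video .npy frame-level anomaly masks
# (as in ShanghaiTech's test_frame_mask folder) into one txt file per
# video, with one 0/1 label per line.
mask_dir = "test_frame_mask"
out_dir = "ground_truth_frame_level"
os.makedirs(mask_dir, exist_ok=True)
os.makedirs(out_dir, exist_ok=True)

# Demo input: a fake 6-frame video with frames 2-4 anomalous.
np.save(os.path.join(mask_dir, "01_0014.npy"), np.array([0, 0, 1, 1, 1, 0]))

for name in sorted(os.listdir(mask_dir)):
    if name.endswith(".npy"):
        labels = np.load(os.path.join(mask_dir, name)).astype(int)
        out_path = os.path.join(out_dir, name.replace(".npy", ".txt"))
        np.savetxt(out_path, labels, fmt="%d")
```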
I cannot reproduce the results yet. Do you have any clues?
Best, Olivier
Here are the logs:
-------------------------- TRAIN --------------------------
==============================
2022-05-05 10:15:03.229132 -
Starting the algorithm with the following parameters:
np=<module 'numpy' from '/usr/local/lib/python3.6/dist-packages/numpy/__init__.py'>
tf=<module 'tensorflow' from '/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py'>
os=<module 'os' from '/usr/lib/python3.6/os.py'>
ProcessingType=<enum 'ProcessingType'>
log_message=<function log_message at 0x7f41269e0620>
check_file_existence=<function check_file_existence at 0x7f41269e08c8>
pdb=<module 'pdb' from '/usr/lib/python3.6/pdb.py'>
sys=<module 'sys' (built-in)>
create_dir=<function create_dir at 0x7f41269e0840>
operating_system=linux
tf_config=gpu_options {
per_process_gpu_memory_fraction: 0.5
}
temporal_size=15
temporal_offsets=[-15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2
3 4 5 6 7 8 9 10 11 12 13 14 15]
detection_threshold=0.8
database_name=shanghaiTech
output_folder_base=/tmp/laurendin/datasets/shanghaiTech/output_yolo_0.80
input_folder_base=/tmp/laurendin/datasets/shanghaiTech/
samples_folder_name=images_15_0.80
samples_folder_name_context=images_with_context_15_0.80
optical_flow_folder_name=optical_flow_15_0.80
meta_folder_name=meta_15_0.80
imagenet_logits_folder_name=imagenet_logits_before_softmax
set_temporal_size=<function set_temporal_size at 0x7f41269e0f28>
block_scale=20
logs_folder=logs
num_samples_for_visualization=500
CHECKPOINTS_PREFIX=conv3d_4_tasks_0.5_mae_wide_deep_resnet_3_obj_relu_resnet
CHECKPOINTS_BASE=/tmp/laurendin/datasets/shanghaiTech/output_yolo_0.80/shanghaiTech/checkpoints/conv3d_4_tasks_0.5_mae_wide_deep_resnet_3_obj_relu_resnet
allowed_video_extensions=['avi', 'mp4']
allowed_image_extensions=['jpg', 'png', 'jpeg']
RESTORE_FROM_HISTORY=False
history_filename=history_shanghaiTech_%s.txt
log_parameters=<function log_parameters at 0x7f41269e0ea0>
==============================
2022-05-05 18:55:30.842101 - Epoch: 0/30
==============================
2022-05-06 02:54:02.902228 - loss train = 0.5892, val = 0.5816
==============================
2022-05-06 02:54:02.902426 - acc fwd train = 0.9452, val = 0.9465
==============================
2022-05-06 02:54:02.902554 - acc cons train = 0.9635, val = 0.9645
==============================
2022-05-06 02:54:02.902655 - loss_resnet = 0.5798, loss_recon = 0.0825
==============================
2022-05-06 02:54:03.569400 - Epoch: 1/30
==============================
2022-05-06 05:54:00.989478 - loss train = 0.4504, val = 0.5455
==============================
2022-05-06 05:54:00.989691 - acc fwd train = 0.9768, val = 0.9671
==============================
2022-05-06 05:54:00.989845 - acc cons train = 0.9800, val = 0.9539
==============================
2022-05-06 05:54:00.989939 - loss_resnet = 0.5721, loss_recon = 0.0711
==============================
2022-05-06 05:54:01.352110 - Epoch: 2/30
==============================
2022-05-06 08:04:16.559149 - loss train = 0.4209, val = 0.5013
==============================
2022-05-06 08:04:16.559325 - acc fwd train = 0.9828, val = 0.9664
==============================
2022-05-06 08:04:16.559439 - acc cons train = 0.9831, val = 0.9715
==============================
2022-05-06 08:04:16.559519 - loss_resnet = 0.5653, loss_recon = 0.0671
==============================
2022-05-06 08:04:16.942609 - Epoch: 3/30
==============================
2022-05-06 10:13:40.053891 - loss train = 0.4061, val = 0.4752
==============================
2022-05-06 10:13:40.054062 - acc fwd train = 0.9857, val = 0.9766
==============================
2022-05-06 10:13:40.054173 - acc cons train = 0.9841, val = 0.9758
==============================
2022-05-06 10:13:40.054251 - loss_resnet = 0.5730, loss_recon = 0.0682
==============================
2022-05-06 10:13:40.422771 - Epoch: 4/30
==============================
2022-05-06 12:23:39.171253 - loss train = 0.3965, val = 0.4755
==============================
2022-05-06 12:23:39.171425 - acc fwd train = 0.9875, val = 0.9769
==============================
2022-05-06 12:23:39.171544 - acc cons train = 0.9851, val = 0.9694
==============================
2022-05-06 12:23:39.171626 - loss_resnet = 0.5571, loss_recon = 0.0655
==============================
2022-05-06 12:23:39.532931 - Epoch: 5/30
==============================
2022-05-06 14:30:42.272644 - loss train = 0.3881, val = 0.4327
==============================
2022-05-06 14:30:42.272787 - acc fwd train = 0.9889, val = 0.9782
==============================
2022-05-06 14:30:42.272894 - acc cons train = 0.9863, val = 0.9853
==============================
2022-05-06 14:30:42.273000 - loss_resnet = 0.5490, loss_recon = 0.0658
==============================
-------------------------- TEST --------------------------
==============================
2022-05-10 08:59:23.187187 -
Starting the algorithm with the following parameters:
np=<module 'numpy' from '/usr/local/lib/python3.6/dist-packages/numpy/__init__.py'>
tf=<module 'tensorflow' from '/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py'>
os=<module 'os' from '/usr/lib/python3.6/os.py'>
ProcessingType=<enum 'ProcessingType'>
log_message=<function log_message at 0x7f87af1af488>
check_file_existence=<function check_file_existence at 0x7f87af1af730>
pdb=<module 'pdb' from '/usr/lib/python3.6/pdb.py'>
sys=<module 'sys' (built-in)>
create_dir=<function create_dir at 0x7f87af1af6a8>
operating_system=linux
tf_config=gpu_options {
per_process_gpu_memory_fraction: 0.5
}
temporal_size=3
temporal_offsets=[-3 -2 -1 0 1 2 3]
detection_threshold=0.8
checkpoint_epoch=5
database_name=shanghaiTech
output_folder_base=/tmp/laurendin/datasets/shanghaiTech/output_yolo_0.80
input_folder_base=/tmp/laurendin/datasets/shanghaiTech/
samples_folder_name=images_3_0.80
samples_folder_name_context=images_with_context_3_0.80
optical_flow_folder_name=optical_flow_3_0.80
meta_folder_name=meta_3_0.80
imagenet_logits_folder_name=imagenet_logits_before_softmax
set_temporal_size=<function set_temporal_size at 0x7f87af1aff28>
block_scale=20
logs_folder=logs
num_samples_for_visualization=500
CHECKPOINTS_PREFIX=conv3d_4_tasks_0.5_mae_wide_deep_resnet_3_obj_relu_resnet
CHECKPOINTS_BASE=/tmp/laurendin/datasets/shanghaiTech/output_yolo_0.80/shanghaiTech/checkpoints/conv3d_4_tasks_0.5_mae_wide_deep_resnet_3_obj_relu_resnet
allowed_video_extensions=['avi', 'mp4']
allowed_image_extensions=['jpg', 'png', 'jpeg']
RESTORE_FROM_HISTORY=False
history_filename=history_shanghaiTech_%s.txt
log_parameters=<function log_parameters at 0x7f87af1afea0>
==============================
2022-05-10 08:59:23.187309 - Function compute_anomaly_scores_per_object has started.
==============================
...
==============================
2022-05-10 09:14:58.228052 - Frame-based AUC is 0.752 on shanghaiTech (all data set).
==============================
2022-05-10 09:14:58.228198 - Avg. (on video) frame-based AUC is 0.850 on shanghaiTech.
==============================
2022-05-10 09:14:58.228310 - Function compute_performance_indices has ended.
Hi Olivier,
Did you change the alpha value in the loss formula (formula 5 in the paper; see the parameter tuning section for the values)? In the released code, alpha is set for Ped2 (middle_fbwd_consecutive_resnet.trainer.py, line 52), which is 0.5; for ShanghaiTech and Avenue, a smaller value of 0.2 was used. I looked at my old results and saw that a wrong alpha value has a bad influence on the final performance, but my results with alpha=1 are still better than your results.
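For readers following along, the role of alpha can be sketched as a simple weighted sum. The grouping of terms below is illustrative only, not the paper's exact formula 5; check the trainer file mentioned above for the real combination:

```python
# Hypothetical grouping, for illustration only: alpha weights an auxiliary
# term against the main task loss (see formula 5 in the paper for the
# actual combination of terms).
def combined_loss(task_loss, aux_loss, alpha):
    return task_loss + alpha * aux_loss

ped2 = combined_loss(0.4, 0.55, alpha=0.5)      # alpha used for Ped2
shanghai = combined_loss(0.4, 0.55, alpha=0.2)  # alpha for ShanghaiTech/Avenue
```

A smaller alpha simply down-weights the auxiliary term, which per the discussion above matters for ShanghaiTech and Avenue.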
I also re-ran the evaluation script and obtained the same results, but I spotted a small difference that reduced the performance by 1%: in compute_performance_scores.py, uncomment line 156 and comment line 157 (this is a difference for Ped2). I do not think this change alone is going to give you back the difference in performance. From my experience, what really hurts the performance on ShanghaiTech is the 3D filter, so please be sure that you did not use it; when the 3D filter is used, the performance drops exactly as in your case.
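The filtering choice matters because the per-frame anomaly scores are smoothed temporally before the frame-level AUC is computed. A minimal numpy-only sketch of 1D Gaussian smoothing (the kernel size and sigma are illustrative; the repo's own gaussian_filter_ helper may differ):

```python
import numpy as np

# Build a normalized 1D Gaussian kernel and smooth per-frame anomaly
# scores with it, analogous to the temporal filtering applied before
# computing the frame-level AUC.
def gaussian_kernel(size, sigma):
    x = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

# Toy per-frame anomaly scores with a short anomalous burst.
scores = np.array([0.1, 0.1, 0.9, 1.0, 0.9, 0.1, 0.1])
smoothed = np.convolve(scores, gaussian_kernel(5, 1.0), mode="same")
```

Smoothing spreads the burst over neighboring frames and lowers its peak, which is why an overly aggressive (e.g. 3D) filter can hurt the AUC.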
Apart from checking the 3D filter and the alpha value, I recommend downloading the code again to be sure that everything is in place.
Best, Lili
Hi Lili,
I restarted my experiments from scratch, implementing the modifications you suggested, and I got significantly better results! They are still slightly below the results in the paper (frame-based AUC of 0.805 and video-averaged frame-based AUC of 0.889), but good enough for the rest of my experiments. Thank you very much for your help and reactivity! You gave me great insight into the inner workings of your code, and I now have a better grasp of the experiments I will have to undertake.
Best, Olivier
Hi Olivier,
I am glad you obtained better results. Thank you, too, for having the patience to test all my suggestions.
Best of luck with your research, Lili
Hi, I recently tried to test your network on the ShanghaiTech dataset, but I have been unable to reproduce the results of your paper. Could you help me identify the issue?
Here are the results I got: "Frame-based AUC is 0.767 on shanghaiTech (all data set). Avg. (on video) frame-based AUC is 0.829 on shanghaiTech."
Compared to the 82.4% Micro-AUC and 89.3% Macro-AUC given on GitHub.
And here are the parameters and the modifications I applied, per file, to comply with the readme:
- args.py: block_scale = 20
- utils.py: ProcessingType.TRAIN = "training", ProcessingType.TEST = "testing"
- train.py: temporal_size = 15, extract_objects(ProcessingType.TRAIN, is_video=True)
- test.py: temporal_size = 3, extract_objects(ProcessingType.TEST, is_video=False)
- compute_performance_score.py: in predict_anomaly_on_frames, commented out the 3D convolution line; in compute_performance_indices, filter_2d = gaussian_filter_(np.arange(1, 202), 31)
I also downloaded the YOLOv3 weights from the provided GitHub repository and used a detection threshold of 0.5. I produced the ShanghaiTech frame-level ground truth (ground_truth_frame_level.txt) from the shanghaitech_semantic_annotation.json file from https://github.com/svip-lab/MLEP/tree/master/data/annotations .
Am I missing something?
Thank you,
Sincerely yours,
Olivier
Hello, may I ask how you generated the ShanghaiTech frame-level ground truth (ground_truth_frame_level.txt)? I tried the approach you described, but it didn't work. Could you tell me how to generate it? I would be honored to receive your reply. @olaurendin @lilygeorgescu