alibaba-mmai-research / TAdaConv

[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
https://tadaconv-iclr2022.github.io
Apache License 2.0

top1-5 accuracy did not achieve the expected effect(Mosi/Finetuned on UCF101/HMDB51 dataset) #18

Open TJQdoIt9527 opened 1 year ago

TJQdoIt9527 commented 1 year ago

Thanks for your great work! It clearly has great research value. However, when I loaded the fine-tuned checkpoint from the MOSI project (with R(2+1)D-10 as the backbone) and ran r2p1d_test.yaml, the top-1 and top-5 accuracies were approximately 72.77% and 91.49%, respectively. I don't know where I made a mistake; I haven't changed the configuration file. Do I need to add any additional configuration parameters? Or could you provide the configuration from your val.log? Looking forward to your reply, thank you very much!

huang-ziyuan commented 1 year ago

Hi, thanks for the interest. I will look into this problem ASAP.

TJQdoIt9527 commented 1 year ago

Thank you for your attention. I later used your fine-tuned checkpoint with the R(2+1)D backbone to evaluate on each of the HMDB51 val files (hmdb51/ucf101 each come with three train/val files). On the first val file, my top-1/top-5 results were both lower than yours, while on the other two val files they were both higher. Is your reported result the average over these three val results? Looking forward to your reply! Thank you.

huang-ziyuan commented 1 year ago

The result was generated on the first split for both UCF and HMDB. We did not average the performance on all three splits.

TJQdoIt9527 commented 1 year ago

Oh, okay, that's even more confusing for me. Here is the val_10clipsx1crops.log output (including the results of all three splits):

[09/01 10:46:56][INFO] tadaconv.utils.checkpoint: 511: Load from given checkpoint file.
Checkpoint file path: /home/lzh/2022/tjq/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth
[09/01 10:46:56][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/01 10:46:56][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/01 10:46:56][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/01 10:46:56][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/01 10:46:56][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/01 10:46:56][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/01 10:46:56][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/01 10:46:56][INFO] test: 215: Testing model for 73 iterations
[09/01 10:47:23][INFO] tadaconv.utils.logging: 89: {"cur_iter": "7", "eta": "0:00:35", "split": "test_iter", "time_diff": 0.535588}
[09/01 10:47:27][INFO] tadaconv.utils.logging: 89: {"cur_iter": "14", "eta": "0:00:33", "split": "test_iter", "time_diff": 0.552698}
[09/01 10:47:33][INFO] tadaconv.utils.logging: 89: {"cur_iter": "21", "eta": "0:00:28", "split": "test_iter", "time_diff": 0.528747}
[09/01 10:47:38][INFO] tadaconv.utils.logging: 89: {"cur_iter": "28", "eta": "0:01:19", "split": "test_iter", "time_diff": 1.731150}
[09/01 10:47:42][INFO] tadaconv.utils.logging: 89: {"cur_iter": "35", "eta": "0:00:20", "split": "test_iter", "time_diff": 0.532526}
[09/01 10:47:48][INFO] tadaconv.utils.logging: 89: {"cur_iter": "42", "eta": "0:00:17", "split": "test_iter", "time_diff": 0.536757}
[09/01 10:47:54][INFO] tadaconv.utils.logging: 89: {"cur_iter": "49", "eta": "0:00:13", "split": "test_iter", "time_diff": 0.524926}
[09/01 10:47:59][INFO] tadaconv.utils.logging: 89: {"cur_iter": "56", "eta": "0:00:09", "split": "test_iter", "time_diff": 0.518019}
[09/01 10:48:03][INFO] tadaconv.utils.logging: 89: {"cur_iter": "63", "eta": "0:00:05", "split": "test_iter", "time_diff": 0.543796}
[09/01 10:48:08][INFO] tadaconv.utils.logging: 89: {"cur_iter": "70", "eta": "0:00:01", "split": "test_iter", "time_diff": 0.485820}
[09/01 10:48:10][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "73.86", "top5_acc": "85.29"}

[09/01 10:53:17][INFO] tadaconv.utils.checkpoint: 511: Load from given checkpoint file.
Checkpoint file path: /home/lzh/2022/tjq/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth
[09/01 10:53:17][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/01 10:53:17][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/01 10:53:17][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/01 10:53:17][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/01 10:53:17][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/01 10:53:17][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/01 10:53:17][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/01 10:53:17][INFO] test: 215: Testing model for 40 iterations
[09/01 10:53:52][INFO] tadaconv.utils.logging: 89: {"cur_iter": "5", "eta": "0:00:35", "split": "test_iter", "time_diff": 0.987247}
[09/01 10:53:59][INFO] tadaconv.utils.logging: 89: {"cur_iter": "10", "eta": "0:01:17", "split": "test_iter", "time_diff": 2.502493}
[09/01 10:54:04][INFO] tadaconv.utils.logging: 89: {"cur_iter": "15", "eta": "0:00:26", "split": "test_iter", "time_diff": 1.014244}
[09/01 10:54:13][INFO] tadaconv.utils.logging: 89: {"cur_iter": "20", "eta": "0:00:20", "split": "test_iter", "time_diff": 0.980417}
[09/01 10:54:18][INFO] tadaconv.utils.logging: 89: {"cur_iter": "25", "eta": "0:00:15", "split": "test_iter", "time_diff": 0.958687}
[09/01 10:54:25][INFO] tadaconv.utils.logging: 89: {"cur_iter": "30", "eta": "0:00:09", "split": "test_iter", "time_diff": 0.908797}
[09/01 10:54:30][INFO] tadaconv.utils.logging: 89: {"cur_iter": "35", "eta": "0:00:05", "split": "test_iter", "time_diff": 0.897914}
[09/01 10:54:37][INFO] tadaconv.utils.logging: 89: {"cur_iter": "40", "eta": "0:00:00", "split": "test_iter", "time_diff": 0.755412}
[09/01 10:54:38][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "73.86", "top5_acc": "85.03"}

[09/01 10:59:38][INFO] tadaconv.utils.checkpoint: 511: Load from given checkpoint file.
Checkpoint file path: /home/lzh/2022/tjq/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth
[09/01 10:59:38][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/01 10:59:38][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/01 10:59:38][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/01 10:59:38][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/01 10:59:38][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/01 10:59:38][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/01 10:59:38][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/01 10:59:38][INFO] test: 215: Testing model for 40 iterations
[09/01 11:00:13][INFO] tadaconv.utils.logging: 89: {"cur_iter": "5", "eta": "0:00:34", "split": "test_iter", "time_diff": 0.967602}
[09/01 11:00:19][INFO] tadaconv.utils.logging: 89: {"cur_iter": "10", "eta": "0:00:58", "split": "test_iter", "time_diff": 1.893316}
[09/01 11:00:25][INFO] tadaconv.utils.logging: 89: {"cur_iter": "15", "eta": "0:00:26", "split": "test_iter", "time_diff": 1.017440}
[09/01 11:00:33][INFO] tadaconv.utils.logging: 89: {"cur_iter": "20", "eta": "0:00:35", "split": "test_iter", "time_diff": 1.672724}
[09/01 11:00:38][INFO] tadaconv.utils.logging: 89: {"cur_iter": "25", "eta": "0:00:15", "split": "test_iter", "time_diff": 0.990686}
[09/01 11:00:46][INFO] tadaconv.utils.logging: 89: {"cur_iter": "30", "eta": "0:00:10", "split": "test_iter", "time_diff": 0.915038}
[09/01 11:00:51][INFO] tadaconv.utils.logging: 89: {"cur_iter": "35", "eta": "0:00:05", "split": "test_iter", "time_diff": 0.908114}
[09/01 11:00:56][INFO] tadaconv.utils.logging: 89: {"cur_iter": "40", "eta": "0:00:00", "split": "test_iter", "time_diff": 0.777181}
[09/01 11:00:57][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "38.89", "top5_acc": "67.25"}

The last result is from the first val split.
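When comparing several runs like the ones above, it can help to pull only the final accuracies out of the log. The following is a minimal sketch, assuming the log keeps the `{"split": "test_final", ...}` record format shown in this thread; `final_accuracies` is a hypothetical helper, not part of the TAdaConv codebase.

```python
import json
import re

# Matches the JSON payload of a "test_final" record as printed in the logs above.
# Per-iteration "test_iter" records are skipped because they lack "test_final".
FINAL_RE = re.compile(r'\{[^{}]*"split": "test_final"[^{}]*\}')


def final_accuracies(log_text):
    """Return (top1, top5) as floats for every test_final record, in log order."""
    results = []
    for match in FINAL_RE.finditer(log_text):
        record = json.loads(match.group(0))
        results.append((float(record["top1_acc"]), float(record["top5_acc"])))
    return results


# Two final records copied from the log in this thread.
sample = (
    '[09/01 10:48:10][INFO] tadaconv.utils.logging: 89: '
    '{"split": "test_final", "top1_acc": "73.86", "top5_acc": "85.29"} '
    '[09/01 11:00:57][INFO] tadaconv.utils.logging: 89: '
    '{"split": "test_final", "top1_acc": "38.89", "top5_acc": "67.25"}'
)
print(final_accuracies(sample))
```

Because the pattern works on the raw text, it does not matter whether the log entries are separated by newlines or run together on one line.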

The configuration content in val.log is as follows:
[09/01 10:46:17][INFO] test: 197: Test with config:
[09/01 10:46:17][INFO] test: 198: {
  "TASK_TYPE": "classification",
  "PRETRAIN": { "ENABLE": false },
  "LOCALIZATION": { "ENABLE": false },
  "TRAIN": { "ENABLE": false, "DATASET": "HMDB51", "BATCH_SIZE": 210, "LOG_FILE": "training_log.log", "EVAL_PERIOD": 5, "NUM_FOLDS": 30, "AUTO_RESUME": true, "CHECKPOINT_PERIOD": 10, "INIT": "", "CHECKPOINT_FILE_PATH": "/home/lzh/2022/tjq/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth", "CHECKPOINT_TYPE": "pytorch", "CHECKPOINT_INFLATE": false, "CHECKPOINT_PRE_PROCESS": { "ENABLE": false }, "FINE_TUNE": true, "ONLY_LINEAR": false, "LR_REDUCE": false, "TRAIN_VAL_COMBINE": false, "LOSS_FUNC": "cross_entropy" },
  "TEST": { "ENABLE": true, "DATASET": "HMDB51", "BATCH_SIZE": 210, "NUM_SPATIAL_CROPS": 1, "SPATIAL_CROPS": "cc", "NUM_ENSEMBLE_VIEWS": 1, "LOG_FILE": "val.log", "CHECKPOINT_FILE_PATH": "", "CHECKPOINT_TYPE": "pytorch", "AUTOMATIC_MULTI_SCALE_TEST": true },
  "VISUALIZATION": { "ENABLE": false, "NAME": "", "FEATURE_MAPS": { "ENABLE": false, "BASE_OUTPUT_DIR": "" } },
  "SUBMISSION": { "ENABLE": false, "SAVE_RESULTS_PATH": "test.json" },
  "DATA": { "DATA_ROOT_DIR": "/data1/hmdb51/", "ANNO_DIR": "/data1/hmdb51_annotations/hmdb51/", "NUM_INPUT_FRAMES": 16, "NUM_INPUT_CHANNELS": 3, "SAMPLING_MODE": "interval_based", "SAMPLING_RATE": 4, "TRAIN_JITTER_SCALES": [ 168, 224 ], "TRAIN_CROP_SIZE": 112, "TEST_SCALE": 224, "TEST_CROP_SIZE": 112, "MEAN": [ 0.45, 0.45, 0.45 ], "STD": [ 0.225, 0.225, 0.225 ], "MULTI_LABEL": false, "ENSEMBLE_METHOD": "sum", "TARGET_FPS": 30, "MINUS_INTERVAL": false, "FPS": 30 },
  "MODEL": { "NAME": "R2Plus1D", "EMA": { "ENABLE": false, "DECAY": 0.99996 } },
  "VIDEO": {
    "BACKBONE": { "DEPTH": 10, "META_ARCH": "ResNet3D", "NUM_FILTERS": [ 64, 64, 128, 256, 512 ], "NUM_INPUT_CHANNELS": 3, "NUM_OUT_FEATURES": 512, "KERNEL_SIZE": [ [ 3, 7, 7 ], [ 3, 3, 3 ], [ 3, 3, 3 ], [ 3, 3, 3 ], [ 3, 3, 3 ] ], "DOWNSAMPLING": [ true, false, true, true, true ], "DOWNSAMPLING_TEMPORAL": [ false, false, true, true, true ], "NUM_STREAMS": 1, "EXPANSION_RATIO": 2, "BRANCH": { "NAME": "R2Plus1DBranch" }, "STEM": { "NAME": "R2Plus1DStem" }, "NONLOCAL": { "ENABLE": false, "STAGES": [ 5 ], "MASK_ENABLE": false }, "INITIALIZATION": null },
    "HEAD": { "NAME": "BaseHead", "ACTIVATION": "softmax", "DROPOUT_RATE": 0.5, "NUM_CLASSES": 51 }
  },
  "OPTIMIZER": { "ADJUST_LR": false, "BASE_LR": 0.00075, "LR_POLICY": "cosine", "MAX_EPOCH": 300, "MOMENTUM": 0.9, "WEIGHT_DECAY": "1e-3", "WARMUP_EPOCHS": 10, "WARMUP_START_LR": 7.5e-05, "OPTIM_METHOD": "adam", "DAMPENING": 0.0, "NESTEROV": true, "BIAS_DOUBLE": false, "NEW_PARAMS": [], "NEW_PARAMS_MULT": 10, "NEW_PARAMS_WD_MULT": 1, "LAYER_WISE_LR_DECAY": 1.0, "COSINE_AFTER_WARMUP": false, "COSINE_END_LR": "1e-6" },
  "BN": { "WB_LOCK": false, "FREEZE": false, "WEIGHT_DECAY": 0.0, "MOMENTUM": 0.1, "EPS": "1e-3", "SYNC": false },
  "DATA_LOADER": { "NUM_WORKERS": 9, "PIN_MEMORY": false, "ENABLE_MULTI_THREAD_DECODE": true, "COLLATE_FN": null },
  "NUM_GPUS": 3, "SHARD_ID": 0, "NUM_SHARDS": 1, "RANDOM_SEED": 0,
  "OUTPUT_DIR": "output/r2p1d_mosi_ft_hmdb_test_split03", "OUTPUT_CFG_FILE": "configuration.log", "LOG_PERIOD": 10, "DIST_BACKEND": "nccl", "LOG_MODEL_INFO": true, "LOG_CONFIG_INFO": true,
  "OSS": { "ENABLE": false, "KEY": null, "SECRET": null, "ENDPOINT": null, "CHECKPOINT_OUTPUT_PATH": null, "SECONDARY_DATA_OSS": { "ENABLE": false, "KEY": null, "SECRET": null, "ENDPOINT": null, "BUCKETS": [ "" ] } },
  "AUGMENTATION": { "COLOR_AUG": true, "BRIGHTNESS": 0.5, "CONTRAST": 0.5, "SATURATION": 0.5, "HUE": 0.25, "GRAYSCALE": 0.3, "CONSISTENT": true, "SHUFFLE": true, "GRAY_FIRST": true, "RATIO": [ 0.857142857142857, 1.1666666666666667 ], "USE_GPU": false, "MIXUP": { "ENABLE": false, "ALPHA": 0.0, "PROB": 1.0, "MODE": "batch", "SWITCH_PROB": 0.5 }, "CUTMIX": { "ENABLE": false, "ALPHA": 0.0, "MINMAX": null }, "RANDOM_ERASING": { "ENABLE": false, "PROB": 0.25, "MODE": "const", "COUNT": [ 1, 1 ], "NUM_SPLITS": 0, "AREA_RANGE": [ 0.02, 0.33 ], "MIN_ASPECT": 0.3 }, "LABEL_SMOOTHING": 0.0, "SSV2_FLIP": false },
  "PAI": false, "USE_MULTISEG_VAL_DIST": false
}

huang-ziyuan commented 1 year ago

It is important to note that our model is trained on the first split, so it is only valid when it is evaluated on the test set of the first split.

For the problem, we suspect that we might have unmatched test lists, and we provide ours here for your information.
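One way to check whether two test lists actually match is to compare the sets of videos they contain. The sketch below assumes each non-empty line of a list file starts with the video path, optionally followed by whitespace and a label; the real list format is not shown in this thread, and `video_ids` and `compare_lists` are hypothetical helpers.

```python
def video_ids(lines):
    """Collect the first whitespace-separated field of every non-empty line.

    Assumption: that first field is the video path; any label after it is ignored.
    """
    ids = set()
    for line in lines:
        fields = line.split()
        if fields:
            ids.add(fields[0])
    return ids


def compare_lists(mine, reference):
    """Return (only_in_mine, only_in_reference, shared) as sorted lists."""
    return sorted(mine - reference), sorted(reference - mine), sorted(mine & reference)


# In practice the lines would come from files, for example:
#   with open("hmdb51_test_list.txt") as handle:
#       mine = video_ids(handle)
# The entries below are made up purely for illustration.
mine = video_ids(["sit/a.avi 38", "run/b.avi 31", ""])
reference = video_ids(["sit/a.avi 38", "walk/c.avi 49"])
only_mine, only_reference, shared = compare_lists(mine, reference)
print(len(only_mine), len(only_reference), len(shared))
```

If either of the first two counts is non-zero, the two runs are not being evaluated on the same videos, so their accuracies are not directly comparable.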

TJQdoIt9527 commented 1 year ago

The accuracy with the test list you provided is still unsatisfactory. The results are as follows:

Checkpoint file path: /home/lzh/2022/tjq/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth
[09/05 17:01:57][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/05 17:01:57][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/05 17:01:57][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/05 17:01:57][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/05 17:01:57][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/05 17:01:57][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/05 17:01:57][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/05 17:01:57][INFO] test: 215: Testing model for 40 iterations
[09/05 17:02:42][INFO] tadaconv.utils.logging: 89: {"cur_iter": "5", "eta": "0:00:36", "split": "test_iter", "time_diff": 1.002031}
[09/05 17:02:54][INFO] tadaconv.utils.logging: 89: {"cur_iter": "10", "eta": "0:04:17", "split": "test_iter", "time_diff": 8.309224}
[09/05 17:02:59][INFO] tadaconv.utils.logging: 89: {"cur_iter": "15", "eta": "0:00:25", "split": "test_iter", "time_diff": 0.966426}
[09/05 17:03:13][INFO] tadaconv.utils.logging: 89: {"cur_iter": "20", "eta": "0:00:20", "split": "test_iter", "time_diff": 0.974530}
[09/05 17:03:18][INFO] tadaconv.utils.logging: 89: {"cur_iter": "25", "eta": "0:00:15", "split": "test_iter", "time_diff": 0.962881}
[09/05 17:03:26][INFO] tadaconv.utils.logging: 89: {"cur_iter": "30", "eta": "0:00:10", "split": "test_iter", "time_diff": 0.929683}
[09/05 17:03:31][INFO] tadaconv.utils.logging: 89: {"cur_iter": "35", "eta": "0:00:05", "split": "test_iter", "time_diff": 0.947996}
[09/05 17:03:37][INFO] tadaconv.utils.logging: 89: {"cur_iter": "40", "eta": "0:00:00", "split": "test_iter", "time_diff": 0.773843}
[09/05 17:03:38][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "40.20", "top5_acc": "69.80"}

One thing I should mention: with decord=0.4.1, my program reported the following error:

Error at decoding. 7/10. Vid index: 1130, Vid path: /data/hmdb51/videos/sit/TheBoondockSaints_sit_u_cm_np1_fr_med_29.avi
Traceback (most recent call last):
  File "/home/lzh/2022/tjq/TAdaConv/tadaconv/datasets/base/base_dataset.py", line 349, in __getitem__
    data, file_to_remove, success = self.decode(
  File "/home/lzh/2022/tjq/TAdaConv/tadaconv/datasets/base/base_dataset.py", line 268, in _decode_video
    frames = dlpack.from_dlpack(vr.get_batch(list_).to_dlpack()).clone()
  File "/home/lzh/anaconda3/envs/tjq_tada/lib/python3.8/site-packages/decord/video_reader.py", line 163, in get_batch
    arr = _CAPI_VideoReaderGetBatch(self._handle, indices)
  File "/home/lzh/anaconda3/envs/tjq_tada/lib/python3.8/site-packages/decord/_ffi/_ctypes/function.py", line 173, in __call__
    check_call(_LIB.DECORDFuncCall(
  File "/home/lzh/anaconda3/envs/tjq_tada/lib/python3.8/site-packages/decord/_ffi/base.py", line 63, in check_call
    raise DECORDError(py_str(_LIB.DECORDGetLastError()))
decord._ffi.base.DECORDError: [17:13:46] /io/decord/src/video/video_reader.cc:645: Error getting frame at: 8 with total frames: 80

Based on this issue (https://github.com/dmlc/decord/issues/124), I suspect there is a problem with the decord version. Indeed, it runs normally with decord=0.4.0 or decord=0.6.0.

The conda list is as follows:
_libgcc_mutex 0.1
aliyun-python-sdk-core 2.13.36
aliyun-python-sdk-kms 2.16.1
ca-certificates 2023.05.30
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 3.2.0
crcmod 1.7
cryptography 41.0.3
decord 0.4.1
einops 0.6.1
idna 3.4
jmespath 0.10.0
joblib 1.3.2
ld_impl_linux-64 2.38
libffi 3.3
libgcc-ng 9.1.0
libstdcxx-ng 9.1.0
ncurses 6.3
numpy 1.24.4
openssl 1.1.1v
oss2 2.18.1
Pillow 10.0.0
pip 23.2.1
psutil 5.9.5
pycparser 2.21
pycryptodome 3.18.0
python 3.8.13
readline 8.1.2
requests 2.31.0
setuptools 68.0.0
simplejson 3.11.1
sqlite 3.38.5
tk 8.6.12
torch 1.12.1+cu113
torchaudio 0.12.1+cu113
torchvision 0.13.1+cu113
typing_extensions 4.7.1
urllib3 2.0.4
wheel 0.38.4
xz 5.2.5
zlib 1.2.12

Is it due to the decord version? Is there anything else I haven't noticed? Looking forward to your reply. Thank you very much

huang-ziyuan commented 1 year ago

This is caused by a change in the cropping function that we use. Set cfg.DATA.TEST_SCALE to 112 and the problem should be solved.
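To see why TEST_SCALE matters when TEST_CROP_SIZE is 112, the effect can be sketched numerically. The snippet assumes (this is not confirmed in the thread) that test-time preprocessing resizes the shorter side of a frame to TEST_SCALE and then takes a centre crop of TEST_CROP_SIZE; `center_crop_coverage` and the 320x240 frame size are hypothetical and only serve as an illustration.

```python
def center_crop_coverage(width, height, test_scale, crop_size):
    """Fraction of the resized frame area that a square centre crop keeps.

    Assumption: the shorter side is resized to test_scale with the aspect
    ratio preserved, then a crop_size x crop_size centre crop is taken.
    """
    scale = test_scale / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    crop_w, crop_h = min(crop_size, new_w), min(crop_size, new_h)
    return (crop_w * crop_h) / (new_w * new_h)


# Hypothetical 320x240 frame with TEST_CROP_SIZE = 112, as in the configs in this thread.
coverage_224 = center_crop_coverage(320, 240, test_scale=224, crop_size=112)
coverage_112 = center_crop_coverage(320, 240, test_scale=112, crop_size=112)
print(f"TEST_SCALE=224 keeps {coverage_224:.1%} of the frame")
print(f"TEST_SCALE=112 keeps {coverage_112:.1%} of the frame")
```

Under this assumption, a 112 crop taken after scaling to 224 sees only a small central region of each frame, whereas scaling to 112 first keeps most of the frame, which would be consistent with the large accuracy gap reported above.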

TJQdoIt9527 commented 1 year ago

After setting this parameter, testing with the test list you provided does indeed improve the results, but they are still about 2 points short of your best result (using the r2p1d/r2d3ds backbones on HMDB51). The results are as follows:

[09/07 13:49:12][INFO] tadaconv.utils.checkpoint: 511: Load from given checkpoint file.
Checkpoint file path: /home/lwd/aim_tada/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth
[09/07 13:49:12][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/07 13:49:12][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/07 13:49:12][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/07 13:49:12][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/07 13:49:12][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/07 13:49:12][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/07 13:49:12][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 1530.
[09/07 13:49:12][INFO] test: 215: Testing model for 4 iterations
[09/07 13:49:33][WARNING] tadaconv.utils.meters: 147: clip count 0: 2, 1: 2 ~= num clips 1
[09/07 13:49:33][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "47.84", "top5_acc": "75.49"}

[09/07 13:49:39][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/07 13:49:39][INFO] test: 215: Testing model for 40 iterations
[09/07 13:50:08][INFO] tadaconv.utils.logging: 89: {"cur_iter": "5", "eta": "0:00:21", "split": "test_iter", "time_diff": 0.595396}
[09/07 13:50:14][INFO] tadaconv.utils.logging: 89: {"cur_iter": "10", "eta": "0:00:20", "split": "test_iter", "time_diff": 0.647939}
[09/07 13:50:17][INFO] tadaconv.utils.logging: 89: {"cur_iter": "15", "eta": "0:00:17", "split": "test_iter", "time_diff": 0.656485}
[09/07 13:50:23][INFO] tadaconv.utils.logging: 89: {"cur_iter": "20", "eta": "0:00:13", "split": "test_iter", "time_diff": 0.644233}
[09/07 13:50:33][INFO] tadaconv.utils.logging: 89: {"cur_iter": "25", "eta": "0:01:46", "split": "test_iter", "time_diff": 6.626712}
[09/07 13:50:36][INFO] tadaconv.utils.logging: 89: {"cur_iter": "30", "eta": "0:00:06", "split": "test_iter", "time_diff": 0.614449}
[09/07 13:50:41][INFO] tadaconv.utils.logging: 89: {"cur_iter": "35", "eta": "0:00:03", "split": "test_iter", "time_diff": 0.557151}
[09/07 13:50:44][INFO] tadaconv.utils.logging: 89: {"cur_iter": "40", "eta": "0:00:00", "split": "test_iter", "time_diff": 0.467891}
[09/07 13:50:44][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "50.13", "top5_acc": "76.80"}

The configuration file is as follows:
[09/07 13:49:10][INFO] test: 197: Test with config:
[09/07 13:49:10][INFO] test: 198: {
  "TASK_TYPE": "classification",
  "PRETRAIN": { "ENABLE": false },
  "LOCALIZATION": { "ENABLE": false },
  "TRAIN": { "ENABLE": false, "DATASET": "HMDB51", "BATCH_SIZE": 210, "LOG_FILE": "training_log.log", "EVAL_PERIOD": 5, "NUM_FOLDS": 30, "AUTO_RESUME": true, "CHECKPOINT_PERIOD": 10, "INIT": "", "CHECKPOINT_FILE_PATH": "/home/lwd/aim_tada/TAdaConv/checkpoint/r2p1d_pt_hmdb_ft_hmdb_5183_public.pyth", "CHECKPOINT_TYPE": "pytorch", "CHECKPOINT_INFLATE": false, "CHECKPOINT_PRE_PROCESS": { "ENABLE": false }, "FINE_TUNE": true, "ONLY_LINEAR": false, "LR_REDUCE": false, "TRAIN_VAL_COMBINE": false, "LOSS_FUNC": "cross_entropy" },
  "TEST": { "ENABLE": true, "DATASET": "HMDB51", "BATCH_SIZE": 384, "NUM_SPATIAL_CROPS": 1, "SPATIAL_CROPS": "cc", "NUM_ENSEMBLE_VIEWS": 1, "LOG_FILE": "val.log", "CHECKPOINT_FILE_PATH": "", "CHECKPOINT_TYPE": "pytorch", "AUTOMATIC_MULTI_SCALE_TEST": true },
  "VISUALIZATION": { "ENABLE": false, "NAME": "", "FEATURE_MAPS": { "ENABLE": false, "BASE_OUTPUT_DIR": "" } },
  "SUBMISSION": { "ENABLE": false, "SAVE_RESULTS_PATH": "test.json" },
  "DATA": { "DATA_ROOT_DIR": "/data1/hmdb51/", "ANNO_DIR": "/data1/hmdb51_annotations/hmdb51/", "NUM_INPUT_FRAMES": 16, "NUM_INPUT_CHANNELS": 3, "SAMPLING_MODE": "interval_based", "SAMPLING_RATE": 4, "TRAIN_JITTER_SCALES": [ 168, 224 ], "TRAIN_CROP_SIZE": 112, "TEST_SCALE": 112, "TEST_CROP_SIZE": 112, "MEAN": [ 0.45, 0.45, 0.45 ], "STD": [ 0.225, 0.225, 0.225 ], "MULTI_LABEL": false, "ENSEMBLE_METHOD": "sum", "TARGET_FPS": 30, "MINUS_INTERVAL": false, "FPS": 30 },
  "MODEL": { "NAME": "R2Plus1D", "EMA": { "ENABLE": false, "DECAY": 0.99996 } },
  "VIDEO": {
    "BACKBONE": { "DEPTH": 10, "META_ARCH": "ResNet3D", "NUM_FILTERS": [ 64, 64, 128, 256, 512 ], "NUM_INPUT_CHANNELS": 3, "NUM_OUT_FEATURES": 512, "KERNEL_SIZE": [ [ 3, 7, 7 ], [ 3, 3, 3 ], [ 3, 3, 3 ], [ 3, 3, 3 ], [ 3, 3, 3 ] ], "DOWNSAMPLING": [ true, false, true, true, true ], "DOWNSAMPLING_TEMPORAL": [ false, false, true, true, true ], "NUM_STREAMS": 1, "EXPANSION_RATIO": 2, "BRANCH": { "NAME": "R2Plus1DBranch" }, "STEM": { "NAME": "R2Plus1DStem" }, "NONLOCAL": { "ENABLE": false, "STAGES": [ 5 ], "MASK_ENABLE": false }, "INITIALIZATION": null },
    "HEAD": { "NAME": "BaseHead", "ACTIVATION": "softmax", "DROPOUT_RATE": 0.5, "NUM_CLASSES": 51 }
  },
  "OPTIMIZER": { "ADJUST_LR": false, "BASE_LR": 0.00075, "LR_POLICY": "cosine", "MAX_EPOCH": 300, "MOMENTUM": 0.9, "WEIGHT_DECAY": "1e-3", "WARMUP_EPOCHS": 10, "WARMUP_START_LR": 7.5e-05, "OPTIM_METHOD": "adam", "DAMPENING": 0.0, "NESTEROV": true, "BIAS_DOUBLE": false, "NEW_PARAMS": [], "NEW_PARAMS_MULT": 10, "NEW_PARAMS_WD_MULT": 1, "LAYER_WISE_LR_DECAY": 1.0, "COSINE_AFTER_WARMUP": false, "COSINE_END_LR": "1e-6" },
  "BN": { "WB_LOCK": false, "FREEZE": false, "WEIGHT_DECAY": 0.0, "MOMENTUM": 0.1, "EPS": "1e-3", "SYNC": false },
  "DATA_LOADER": { "NUM_WORKERS": 8, "PIN_MEMORY": false, "ENABLE_MULTI_THREAD_DECODE": true, "COLLATE_FN": null },
  "NUM_GPUS": 4, "SHARD_ID": 0, "NUM_SHARDS": 1, "RANDOM_SEED": 0,
  "OUTPUT_DIR": "output/r2p1d_mosi_ft_hmdb_test_split66", "OUTPUT_CFG_FILE": "configuration.log", "LOG_PERIOD": 10, "DIST_BACKEND": "nccl", "LOG_MODEL_INFO": true, "LOG_CONFIG_INFO": true,
  "OSS": { "ENABLE": false, "KEY": null, "SECRET": null, "ENDPOINT": null, "CHECKPOINT_OUTPUT_PATH": null, "SECONDARY_DATA_OSS": { "ENABLE": false, "KEY": null, "SECRET": null, "ENDPOINT": null, "BUCKETS": [ "" ] } },
  "AUGMENTATION": { "COLOR_AUG": true, "BRIGHTNESS": 0.5, "CONTRAST": 0.5, "SATURATION": 0.5, "HUE": 0.25, "GRAYSCALE": 0.3, "CONSISTENT": true, "SHUFFLE": true, "GRAY_FIRST": true, "RATIO": [ 0.857142857142857, 1.1666666666666667 ], "USE_GPU": false, "MIXUP": { "ENABLE": false, "ALPHA": 0.0, "PROB": 1.0, "MODE": "batch", "SWITCH_PROB": 0.5 }, "CUTMIX": { "ENABLE": false, "ALPHA": 0.0, "MINMAX": null }, "RANDOM_ERASING": { "ENABLE": false, "PROB": 0.25, "MODE": "const", "COUNT": [ 1, 1 ], "NUM_SPLITS": 0, "AREA_RANGE": [ 0.02, 0.33 ], "MIN_ASPECT": 0.3 }, "LABEL_SMOOTHING": 0.0, "SSV2_FLIP": false, "COLOR_P": 0.0, "AUTOAUGMENT": { "ENABLE": true, "BEFORE_CROP": true, "TYPE": "rand-m9-n4-mstd0.5-inc1" } },
  "PAI": false, "USE_MULTISEG_VAL_DIST": false
}

Are there any other parameters that I need to pay attention to? For example, AUTOAUGMENT? Looking forward to your reply, thank you very much

huang-ziyuan commented 1 year ago

I do not see a problem in your current config. Autoaugment only affects the training process. We have verified on our end that the checkpoint achieves exactly 51.83 on HMDB51. Are you already using our test list?

huang-ziyuan commented 1 year ago

On our side, we have reproduced a result similar to yours using decord==0.6.0. We used decord==0.4.0 for producing 51.83 on HMDB51.
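Since the decoder version changes the result, it may be worth checking the installed decord version before running the test. This is a minimal sketch using only the standard library; the expected version 0.4.0 comes from this thread, while `decord_version_note` and `installed_decord_version` are hypothetical helpers.

```python
from importlib import metadata

# Version the maintainers report using to obtain 51.83 on HMDB51.
EXPECTED_DECORD = "0.4.0"


def installed_decord_version():
    """Return the installed decord version string, or None if it is missing."""
    try:
        return metadata.version("decord")
    except metadata.PackageNotFoundError:
        return None


def decord_version_note(installed, expected=EXPECTED_DECORD):
    """Describe how the installed version relates to the expected one."""
    if installed is None:
        return "decord is not installed"
    if installed == expected:
        return f"decord=={installed} matches the reported setup"
    return f"decord=={installed} differs from {expected}; accuracy may not match"


print(decord_version_note(installed_decord_version()))
```

The check only compares version strings; it does not explain why different decord releases decode frames differently.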

TJQdoIt9527 commented 1 year ago

Yes, I am using the test list you provided. You are right: the results I obtained above were with decord=0.6.0. After switching to decord=0.4.0, my results were as follows:
[09/07 15:25:04][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "51.76", "top5_acc": "78.04"}

It is very close to your result. This makes me extremely excited, and I am also very grateful for your patient guidance!

However, when I tested the r2d3ds backbone on the HMDB51 dataset, the results still differ from yours by about two points. (I set TEST_SCALE: 112, used decord=0.4.0, and used the test list you provided.) The results are as follows:

Checkpoint file path: /home/lwd/aim_tada/TAdaConv/checkpoint/r2d3ds_pt_hmdb_ft_hmdb_4693_public.pyth
[09/07 16:26:19][INFO] tadaconv.utils.checkpoint: 328: Keys in model not matched: []
[09/07 16:26:19][INFO] tadaconv.utils.checkpoint: 329: Keys in checkpoint not matched: []
[09/07 16:26:19][INFO] tadaconv.utils.checkpoint: 337: Model ema weights not loaded because no ema state stored in checkpoint.
[09/07 16:26:19][INFO] tadaconv.utils.checkpoint: 338: Unmatched keys in model could due to new parameters introduced,and unmatched keys in checkpoint might be caused by removing structures from the original model.Both are normal.
[09/07 16:26:19][INFO] tadaconv.datasets.base.hmdb51: 37: Reading video list from file: hmdb51_test_list.txt
[09/07 16:26:19][INFO] tadaconv.datasets.base.base_dataset: 172: Loading HMDB51 dataset list for split 'test'...
[09/07 16:26:19][INFO] tadaconv.datasets.base.base_dataset: 197: Dataset HMDB51 split test loaded. Length 15300.
[09/07 16:26:19][INFO] test: 215: Testing model for 20 iterations
[09/07 16:27:05][INFO] tadaconv.utils.logging: 89: {"cur_iter": "5", "eta": "0:00:12", "split": "test_iter", "time_diff": 0.810198}
[09/07 16:27:23][INFO] tadaconv.utils.logging: 89: {"cur_iter": "10", "eta": "0:02:44", "split": "test_iter", "time_diff": 14.951716}
[09/07 16:27:25][INFO] tadaconv.utils.logging: 89: {"cur_iter": "15", "eta": "0:00:03", "split": "test_iter", "time_diff": 0.504766}
[09/07 16:27:44][INFO] tadaconv.utils.logging: 89: {"cur_iter": "20", "eta": "0:00:00", "split": "test_iter", "time_diff": 0.182867}
[09/07 16:27:45][INFO] tadaconv.utils.logging: 89: {"split": "test_final", "top1_acc": "44.64", "top5_acc": "72.55"}

May I ask whether this result is normal, or what additional parameters I need to add for this backbone?

huang-ziyuan commented 12 months ago

Could you try using the code of this version to reproduce the result and see whether it is normal?