facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Apache License 2.0

Fine Tuning AVA model on custom dataset #417

Open AmirAliVz opened 3 years ago

AmirAliVz commented 3 years ago

Hi, thank you for sharing your great code!

I'm trying to fine-tune the AVA model on my own dataset, which has 6 classes, but I'm not sure about the expected dataset format. I named my dataset ava and prepared the annotation files as described in DATASET.md; however, I couldn't figure out how to generate the action labels for each frame automatically, since labeling every frame by hand is very tedious. To check whether the code runs at all, I labeled a small portion of my dataset manually, but then I hit the following error:

    Traceback (most recent call last):
      File "tools/run_net.py", line 44, in <module>
        main()
      File "tools/run_net.py", line 25, in main
        launch_job(cfg=cfg, init_method=args.init_method, func=train)
      File "/media/adminadmin/New Volume/Vaziri/SlowFast/slowfast/utils/misc.py", line 297, in launch_job
        func(cfg=cfg)
      File "/media/adminadmin/New Volume/Vaziri/SlowFast/tools/train_net.py", line 450, in train
        train_epoch(
      File "/media/adminadmin/New Volume/Vaziri/SlowFast/tools/train_net.py", line 88, in train_epoch
        optimizer.step()
      File "/usr/local/lib/python3.8/dist-packages/torch/optim/optimizer.py", line 89, in wrapper
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/optim/sgd.py", line 110, in step
        F.sgd(params_with_grad,
      File "/usr/local/lib/python3.8/dist-packages/torch/optim/functional.py", line 169, in sgd
        buf.mul_(momentum).add_(d_p, alpha=1 - dampening)
    RuntimeError: The size of tensor a (80) must match the size of tensor b (6) at non-singleton dimension 0

I guess the code is somehow hard-coded for the AVA dataset and its 80 classes, but I don't know how to use it on my own dataset. My config file looks like this:

    TRAIN:
      ENABLE: True
      DATASET: ava
      BATCH_SIZE: 2
      EVAL_PERIOD: 1
      CHECKPOINT_PERIOD: 1
      AUTO_RESUME: True
      CHECKPOINT_FILE_PATH: /media/adminadmin/New Volume/Vaziri/SlowFast/Models/SLOWFAST_32x2_R101_50_50.pkl
      CHECKPOINT_TYPE: pytorch
    DATA:
      PATH_TO_DATA_DIR: /media/adminadmin/New Volume/Vaziri/SlowFast/data/ava
      NUM_FRAMES: 32
      SAMPLING_RATE: 1
      TRAIN_JITTER_SCALES: [256, 320]
      TRAIN_CROP_SIZE: 224
      TEST_CROP_SIZE: 256
      INPUT_CHANNEL_NUM: [3, 3]
    DETECTION:
      ENABLE: True
      ALIGNED: False
    AVA:
      BGR: False
      DETECTION_SCORE_THRESH: 0.8
      TEST_PREDICT_BOX_LISTS: ["person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv"]
      FRAME_LIST_DIR: /media/adminadmin/New Volume/Vaziri/SlowFast/data/ava/frames
      TRAIN_LISTS: ["/media/adminadmin/New Volume/Vaziri/SlowFast/data/ava/frame_lists/train.csv"]
      TEST_LISTS: ["/media/adminadmin/New Volume/Vaziri/SlowFast/data/ava/frame_lists/val.csv"]
      ANNOTATION_DIR: /media/adminadmin/New Volume/Vaziri/SlowFast/data/ava/annotations
    SLOWFAST:
      ALPHA: 4
      BETA_INV: 8
      FUSION_CONV_CHANNEL_RATIO: 2
      FUSION_KERNEL_SZ: 5
    RESNET:
      ZERO_INIT_FINAL_BN: True
      WIDTH_PER_GROUP: 64
      NUM_GROUPS: 1
      DEPTH: 101
      TRANS_FUNC: bottleneck_transform
      STRIDE_1X1: False
      NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
      SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [2, 2]]
      SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [1, 1]]
    NONLOCAL:
      LOCATION: [[[], []], [[], []], [[6, 13, 20], []], [[], []]]
      GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
      INSTANTIATION: dot_product
      POOL: [[[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]]]
    BN:
      USE_PRECISE_STATS: False
      NUM_BATCHES_PRECISE: 200
    SOLVER:
      MOMENTUM: 0.9
      WEIGHT_DECAY: 1e-7
      OPTIMIZING_METHOD: sgd
    MODEL:
      NUM_CLASSES: 6
      ARCH: slowfast
      MODEL_NAME: SlowFast
      LOSS_FUNC: bce
      DROPOUT_RATE: 0.5
      HEAD_ACT: sigmoid
    TEST:
      ENABLE: True
      DATASET: ava
      BATCH_SIZE: 2
    DATA_LOADER:
      NUM_WORKERS: 2
      PIN_MEMORY: True
    NUM_GPUS: 1
    NUM_SHARDS: 1
    RNG_SEED: 0
    OUTPUT_DIR: .
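For reference on the annotation format: AVA-style ground-truth files (the CSVs under ANNOTATION_DIR) have one row per person box per labeled keyframe, in the form `video_id,timestamp,x1,y1,x2,y2,action_id,person_id`, with box coordinates normalized to [0, 1]. The rows below are invented placeholders, not real data:

```csv
my_video_001,0902,0.077,0.151,0.283,0.811,4,0
my_video_001,0902,0.332,0.194,0.521,0.900,2,1
```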

wenqiang-china commented 3 years ago

The cause of this problem lies in slowfast/utils/checkpoint.py, in the load_checkpoint function:


        if "epoch" in checkpoint.keys() and not epoch_reset:
            epoch = checkpoint["epoch"]
            if optimizer:
                optimizer.load_state_dict(checkpoint["optimizer_state"])

So when the 'epoch' info is stored in the checkpoint and the epoch_reset argument is set to False, the optimizer state is loaded as well. When SGD then applies the gradient d_p, the old momentum buffer (still sized for the original 80-class head) is reused, and the dimension mismatch occurs.
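The mismatch can be reproduced in isolation. The sketch below mimics torch's SGD momentum update (`buf.mul_(momentum).add_(d_p, alpha=1 - dampening)`) with NumPy arrays standing in for the restored buffer and the new gradient; the array sizes are taken from the error above, everything else is illustrative:

```python
import numpy as np

momentum, dampening = 0.9, 0.0

# Momentum buffer restored from the 80-class AVA checkpoint's optimizer state.
buf = np.zeros(80)
# Gradient of the replacement 6-class head's weights.
d_p = np.ones(6)

try:
    # Same arithmetic as buf.mul_(momentum).add_(d_p, alpha=1 - dampening):
    buf = buf * momentum + (1.0 - dampening) * d_p
except ValueError as err:
    # Fails because an 80-element buffer cannot be combined with a 6-element gradient.
    print("shape mismatch:", err)
```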

To solve this problem, set TRAIN.CHECKPOINT_EPOCH_RESET to True.
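In config terms, that means adding one line to the TRAIN section of the YAML above (CHECKPOINT_EPOCH_RESET is an existing PySlowFast option; the other keys are repeated from the original config):

```yaml
TRAIN:
  ENABLE: True
  DATASET: ava
  CHECKPOINT_FILE_PATH: /media/adminadmin/New Volume/Vaziri/SlowFast/Models/SLOWFAST_32x2_R101_50_50.pkl
  CHECKPOINT_TYPE: pytorch
  # Skip restoring the saved epoch and optimizer state, so fresh momentum
  # buffers are created to match the resized 6-class head.
  CHECKPOINT_EPOCH_RESET: True
```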