wei-tim / YOWO

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

Can't obtain the video-mAP value reported in the paper #70

Open MiaSanLei opened 3 years ago

MiaSanLei commented 3 years ago

I ran inference with the best checkpoint file downloaded from the website. I can almost reproduce the frame-mAP values reported in the paper: 80.37 vs. 80.4 on UCF24 and 74.51 vs. 74.4 on JHMDB. However, the video-mAP is about three points lower than the reported value: on JHMDB, video-mAP@0.5 = 82.57 vs. 85.7. Could you tell me why, please?
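
For readers comparing the two metrics: frame-mAP scores each frame's boxes independently, while video-mAP@0.5 scores whole action tubes against a spatio-temporal IoU threshold of 0.5. The sketch below uses one common definition of tube IoU (temporal IoU of the frame spans multiplied by the mean box IoU over shared frames). It is only an illustration under that assumption, not the evaluation code of this repository, and the names `box_iou` and `tube_iou` are hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tube = Dict[int, Box]                    # frame index -> box in that frame


def box_iou(a: Box, b: Box) -> float:
    """Plain 2D intersection-over-union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def tube_iou(pred: Tube, gt: Tube) -> float:
    """Assumed definition: temporal IoU times mean spatial IoU on shared frames."""
    shared = pred.keys() & gt.keys()
    if not shared:
        return 0.0
    temporal = len(shared) / len(pred.keys() | gt.keys())
    spatial = sum(box_iou(pred[f], gt[f]) for f in shared) / len(shared)
    return temporal * spatial


# Ground truth: 10 frames with the same box in each.
gt = {f: (0.0, 0.0, 10.0, 10.0) for f in range(10)}
# Prediction with perfect boxes, but linked over only the first 4 frames.
short_pred = {f: (0.0, 0.0, 10.0, 10.0) for f in range(4)}
# Prediction with perfect boxes over the first 8 frames.
long_pred = {f: (0.0, 0.0, 10.0, 10.0) for f in range(8)}

print(tube_iou(short_pred, gt))  # every frame matches, yet the tube is below 0.5
print(tube_iou(long_pred, gt))
```

Under this definition, a tube whose per-frame boxes are all correct can still miss the 0.5 threshold if the linking step drops frames, so video-mAP depends on the tube-linking settings as well as on the detector. Whether that accounts for the gap reported here is not established in this thread.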

Runnert commented 3 years ago

Same here: my video-mAP is about four points lower than the one reported in the paper. For UCF24, I get video-mAP@0.5 = 44.96.

Riiick2011 commented 2 years ago

Can you tell me what your JHMDB experiment's cfg looks like? My frame-mAP is only about 59% on JHMDB. Here is my config:

----------jhmdb.yaml----------
TRAIN:
  RESUME_PATH: "/home/su/YOWO/backup/jhmdb/yowo_jhmdb21_16f_best.pth" # "/home/su/YOWO/backup/jhmdb/yowo_jhmdb-21_16f_best.pth"
  DATASET: jhmdb21 # ava, ucf24 or jhmdb21
  BATCH_SIZE: 24
  TOTAL_BATCH_SIZE: 128
  LEARNING_RATE: 1e-4
  EVALUATE: False
  FINE_TUNE: False
  BEGIN_EPOCH: 1
  END_EPOCH: 10
SOLVER:
  MOMENTUM: 0.9
  WEIGHT_DECAY: 5e-4
  STEPS: [3, 4, 5, 6]
  LR_DECAY_RATE: 0.5
  ANCHORS: [0.95878, 3.10197, 1.67204, 4.0040, 1.75482, 5.64937, 3.09299, 5.80857, 4.91803, 6.25225]
  NUM_ANCHORS: 5
  OBJECT_SCALE: 5
  NOOBJECT_SCALE: 1
  CLASS_SCALE: 1
  COORD_SCALE: 1
DATA:
  NUM_FRAMES: 16
  SAMPLING_RATE: 1
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 224
  MEAN: [0.4345, 0.4051, 0.3775]
  STD: [0.2768, 0.2713, 0.2737]
MODEL:
  NUM_CLASSES: 21
  BACKBONE_3D: resnext101
  BACKBONE_2D: darknet
WEIGHTS:
  BACKBONE_3D: "weights/resnext-101-kinetics-hmdb51_split1.pth"
  BACKBONE_2D: "weights/yolo.weights"
  FREEZE_BACKBONE_3D: True
  FREEZE_BACKBONE_2D: True
LISTDATA:
  BASE_PTH: "/data1/su/datasets/JHMDB-YOWO"
  TRAIN_FILE: "/data1/su/datasets/JHMDB-YOWO/trainlist.txt"
  TEST_FILE: "/data1/su/datasets/JHMDB-YOWO/testlist.txt"
  TEST_VIDEO_FILE: "/data1/su/datasets/JHMDB-YOWO/testlist_video.txt"
  MAX_OBJS: 1
  CLASS_NAMES: [
    "brush_hair", "catch", "clap", "climb_stairs", "golf", "jump", "kick_ball",
    "pick", "pour", "pullup", "push", "run", "shoot_ball", "shoot_bow",
    "shoot_gun", "sit", "stand", "swing_baseball", "throw", "walk", "wave"
  ]
BACKUP_DIR: "backup/jhmdb"
RNG_SEED: 1
NUM_GPUS: 4
VISBLE_GPUS: '"0, 1, 2, 3"'
GPUS_ID: [0, 1, 2, 3]
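
When comparing configs like this one, a quick internal-consistency check can rule out the simplest mismatches before looking at training or evaluation settings. The sketch below is a hypothetical helper (`check_cfg` is not part of the repository) that works on a plain dictionary holding a subset of the fields posted above. It only verifies that fields agree with each other, and it assumes that `ANCHORS` stores one (width, height) pair per anchor.

```python
def check_cfg(cfg: dict) -> list:
    """Return a list of human-readable inconsistencies found in the config."""
    problems = []

    # The number of class names should match the declared class count.
    n_classes = cfg["MODEL"]["NUM_CLASSES"]
    n_names = len(cfg["LISTDATA"]["CLASS_NAMES"])
    if n_classes != n_names:
        problems.append(f"NUM_CLASSES={n_classes} but {n_names} class names")

    # Assumption: each anchor is a (width, height) pair, so 2 values per anchor.
    n_anchor_values = len(cfg["SOLVER"]["ANCHORS"])
    n_anchors = cfg["SOLVER"]["NUM_ANCHORS"]
    if n_anchor_values != 2 * n_anchors:
        problems.append(f"{n_anchor_values} anchor values for {n_anchors} anchors")

    # Normalization statistics should have one entry per RGB channel.
    for key in ("MEAN", "STD"):
        if len(cfg["DATA"][key]) != 3:
            problems.append(f"DATA.{key} should have 3 entries")

    return problems


# Subset of the values posted above, copied by hand into a dictionary.
cfg = {
    "SOLVER": {
        "ANCHORS": [0.95878, 3.10197, 1.67204, 4.0040, 1.75482,
                    5.64937, 3.09299, 5.80857, 4.91803, 6.25225],
        "NUM_ANCHORS": 5,
    },
    "DATA": {
        "MEAN": [0.4345, 0.4051, 0.3775],
        "STD": [0.2768, 0.2713, 0.2737],
    },
    "MODEL": {"NUM_CLASSES": 21},
    "LISTDATA": {
        "CLASS_NAMES": [
            "brush_hair", "catch", "clap", "climb_stairs", "golf", "jump",
            "kick_ball", "pick", "pour", "pullup", "push", "run",
            "shoot_ball", "shoot_bow", "shoot_gun", "sit", "stand",
            "swing_baseball", "throw", "walk", "wave",
        ],
    },
}

print(check_cfg(cfg))
```

The posted values pass all three checks, so the low frame-mAP would have to come from something this sketch does not inspect, such as the dataset files, the checkpoint, or the training settings.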