fredzzhang / upt

[CVPR'22] Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"
https://fredzzhang.com/unary-pairwise-transformers
BSD 3-Clause "New" or "Revised" License

HICO-DET training accuracy problem #64

Closed hutuo1213 closed 1 year ago

hutuo1213 commented 1 year ago

With the training settings batch_size=4 and world_size=4, I get mAP 0.3098 (rare: 0.2528, non-rare: 0.3268) on HICO-DET. When the batch size is slightly larger than 16, I get mAP 0.3119 (rare: 0.2539, non-rare: 0.3293). Evaluating the UPT (ResNet-50) weights you released gives mAP 0.3156 (rare: 0.2560, non-rare: 0.3334). I would like to fully reproduce your training performance. Did I miss some settings? Any advice would be appreciated.

Training log:

```
nohup: ignoring input
Namespace(alpha=0.5, aux_loss=True, backbone='resnet50', batch_size=4, bbox_loss_coef=5, box_score_thresh=0.2, cache=False, clip_max_norm=0.1, data_root='./hicodet', dataset='hicodet', dec_layers=6, device='cuda', dilation=False, dim_feedforward=2048, dropout=0.1, enc_layers=6, eos_coef=0.1, epochs=20, eval=False, fg_iou_thresh=0.5, gamma=0.2, giou_loss_coef=2, hidden_dim=256, lr_drop=10, lr_head=0.0001, max_instances=15, min_instances=3, nheads=8, num_queries=100, num_workers=2, output_dir='checkpoints/upt-r50-hicodet', partitions=['train2015', 'test2015'], port='1234', position_embedding='sine', pre_norm=False, pretrained='checkpoints/detr-r50-hicodet.pth', print_interval=500, repr_dim=512, resume='', sanity=False, seed=66, set_cost_bbox=5, set_cost_class=1, set_cost_giou=2, weight_decay=0.0001, world_size=4)
Load weights for the object detector from checkpoints/detr-r50-hicodet.pth
=> Rank 0: start from a randomly initialised model
=> Rank 1: start from a randomly initialised model
=> Rank 3: start from a randomly initialised model
=> Rank 2: start from a randomly initialised model
```
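For context, a minimal sketch of the effective batch size implied by the arguments above, assuming the usual convention that batch_size is the per-process batch under DistributedDataParallel:

```python
# Sanity check of the effective batch size implied by the Namespace above.
# With DistributedDataParallel, each of the `world_size` processes consumes
# `batch_size` images per iteration, so the effective batch is their product.
batch_size = 4      # per-GPU batch size from the Namespace dump
world_size = 4      # number of distributed processes (GPUs)

effective_batch = batch_size * world_size
print(f"Effective batch size: {effective_batch}")   # -> 16
```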

fredzzhang commented 1 year ago

Hi @yaoyaosanqi,

The 0.3119 you got is pretty close to the reported performance. There is a bit of randomness in the CUDA backend, so you are not guaranteed to get exactly the same results.
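As a side note, here is a generic PyTorch seeding sketch (not taken from the UPT codebase) that reduces, but does not eliminate, this kind of run-to-run variance; results can still differ across GPUs, drivers and CUDA versions:

```python
import random

import numpy as np
import torch

def seed_everything(seed: int = 66) -> None:
    """Reduce run-to-run variance from random initialisation and CUDA kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # cuDNN autotuning can pick different kernels on each run; disabling it
    # trades some speed for repeatability. Some CUDA ops remain non-deterministic.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
```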

On the other hand, are you evaluating the checkpoint from the 20th epoch? You might want to test a few checkpoints after the learning rate drop. If I recall correctly, based on cross-validation results, we found that the model reaches its best performance around the 13th-14th epoch, after which the performance starts to fluctuate. If you test a few more checkpoints around that point, say epochs 12-17, you might find one with higher performance.
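For example, a hypothetical sweep over the saved checkpoints; the script name main.py, the flag spellings and the checkpoint naming scheme are assumptions inferred from the argument names in the log above, so adjust them to the actual repository layout:

```python
# Evaluate checkpoints from epochs 12-17 and pick the one with the best mAP.
import subprocess

for epoch in range(12, 18):
    # Hypothetical checkpoint path; match it to what is written to output_dir.
    ckpt = f"checkpoints/upt-r50-hicodet/ckpt_epoch_{epoch}.pt"
    subprocess.run([
        "python", "main.py",
        "--world-size", "4",
        "--eval",
        "--resume", ckpt,
    ], check=True)
```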

Fred.

hutuo1213 commented 1 year ago

Thanks for the tip, very good work!